Wednesday, September 11, 2013

Automated Statistical Reports in MS Office

I prefer LyX to Microsoft Office when writing academic works, but sometimes MS Word/Powerpoint is necessary. My one requirement for the workflow is that manual fiddling isn't required. This is a bit challenging because Office products can't show PS/PDF material inline. You can include an EPS as a graphic which can print OK (if you print to a PDF or have a PS printer), which is fine for Word, but will only show an unavoidably ugly "preview" inline which doesn't work for PowerPoint. For presentations, you should output to a raster format (like PNG) or for tables you can possibly output to EMF.

Generating EPS/PNG files from graphics is built into most stats software (R, Stata), so the only hard part is converting tables. I'll assume you can generate these a tex and start there (so as to minimize differences between Office and Latex generated reports). Scripts below were tested on Windows 7 with Cygwin and I had placed the programs in step four in my path.
  1. Export a table to mytable.frag.tex.
  2. Wrap that tex fragment in a minimal "standalone" document.
    • >echo \documentclass[varwidth=true, border=10pt]{standalone} > mytable.tex
    • >echo \begin{document} >> mytable.tex
    • >cat mytable.frag.tex  >> mytable.tex
    • >echo \end{document} >> mytable.tex
  3. Make your table into a "standalone" PDF (the PDF will be just the size of the table). Your TeX distribution will automatically download the 'standalone' package.
    • >pdflatex mytable.tex
  4. Convert the PDF to the output required.
    • >pdftops -eps mytable.pdf mytable.eps
      • Could also do this in Ghostscript like below.
    • >pstoedit -f emf mytable.pdf mytable.emf
      • This will change the fonts on you. If you have Greek letters this conversion might fail in which case you need to add -adt which turns letters into polygons (not perfect but OK). You can also convert with Adobe Illustrator which will also do font conversion.
    • >gswin64c -dSAFER -dBATCH -dNOPAUSE -sDEVICE=pnggray -r600 -sOutputFile=mytable.png mytable.pdf
There are two options for inserting. The first is the easiest uses absolute paths for images so won't work well with sharing the files among other users (e.g. in Dropbox) or moving the project folder.

  • Use normal image insertion but use the dropdown next to "Insert" and select "Insert and Link". Some VBA scripts are available that will convert the absolute path to relative paths (e.g. here).
  • Insert pictures using field codes. Insert -> Quick Parts -> IncludePicture. Then put in a relative path like pics\pic1.png and select "Data not stored with document" (and you probably want to check resizing both horizontally and vertically). If you want to convert to pictures (if you send it out), then remove the "Data not ..." by removing "\d" from the field code, and make a formatting edit (like going to Format Picture and then changing the Layout wrapping style).

Now when the linked files are updated by programs, they will be updated in Office. If the document is open then select an image (or using select all) and press F9.

Edit 2016-01-22: Expanded part about relative paths to images.

Also note that emfs might not display correctly on Mac Powerpoints (same for MathType equations). You can print pptxs to pdfs but then you will lose any animations.

No comments: