Big LyX Sweave output breakthrough

I've tried to set up LyX converters and copiers many times to achieve a particular result. Today I've succeeded for the very first time.

From a single export, I want

  1. pdf output for a lecture
  2. article format output for the same lecture (for people who want to save paper)
  3. The R code that goes with the lecture

I need to have all 3 synchronized, ALL the time. I mirror my work directories to the Web pages, and it is not great when they go out of sync. I've got to do this several times a week, and it is a hassle if I have to remember lots of steps.

Here's the example output

first-R-03.pdf

first-R-03-article.pdf

Or just go look at the output yourself. I'm testing this on my new elementary R exercises: http://pj.freefaculty.org/guides/Rcourse/First-R/

The R code file that goes with them is called first-R-03.R.

In case you have never used Sweave, or LyX, some little explanation is in order. The internal conversion in LyX goes like this. The document I'm editing in LyX is really a noweb "literate" programming file, otherwise known as an Sweave document: words and comments with chunks of R code buried throughout. If I were editing raw LaTeX, it would be an Rnw file. As the first step in the document processing system, LyX dumps out its content and converts it into a proper Rnw file. That Rnw file is converted to LaTeX by R's Sweave command, which sends the R code chunks to R for processing and writes out the R output files along with a LaTeX file. I prefer the command line "R CMD Sweave filename.Rnw". The sweaving produces a tex file, which is then converted to pdf by the usual LaTeX methods.
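The weave-then-compile sequence described above can be sketched as a small shell function. This is only an illustration, not one of my actual scripts: the function name `weave_plan` is made up for this example, it assumes file names without spaces, and it only prints the two commands rather than running them, so it can be tried on a machine without R or LaTeX.

```shell
#!/bin/bash
# Sketch: print the commands that turn one Rnw file into a pdf.
# weave_plan is a hypothetical helper name, not part of LyX or R.
weave_plan() {
    local rnw="$1"              # e.g. first-R-03.Rnw
    local base="${rnw%.Rnw}"    # strip the extension: first-R-03
    # Step 1: Sweave runs the R chunks and writes the tex file
    printf 'R CMD Sweave %s\n' "$rnw"
    # Step 2: pdflatex compiles that tex file to pdf
    printf 'pdflatex %s\n' "$base.tex"
}

weave_plan first-R-03.Rnw
```

Piping the printed lines to `sh` would actually run them, assuming R and pdflatex are on the PATH.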

Students may want to know what R code produced the figures in my document, and I need that to be immediately available. So, in addition to generating the pdf output, I also need to extract the R commands from the Rnw file. That is called "tangling" in the R parlance. I would have called it "untangling", since it extracts the R commands that are embedded in the Rnw file. But I didn't invent the whole process, I'm just a fan of it. Thanks to Professor F. Leisch for thinking of the whole idea, working it out beautifully, and documenting it excellently.

There's a wrinkle. My Rnw file has the option to split the output of each command into separate files in the plots directory (look on the web site and you'll see a "plots" folder). That makes the output much easier to work with and integrate into documents in a "fine grained", user-controlled way. I've got separate little output files, and I can input them wherever I want.

However, that approach does not generate a single R file when the tangling is done. Joy of joys, however. I can use perl to change the split option in the Rnw file AFTER it has generated its output, and then I tangle that file after changing the split setting to false. Genius. I'm sorry to brag, but this has been a pain in the rump for about 3 years.

How to achieve this long sought after magic?  The magic bean is the abuse & misuse of LyX's copier customization. The usual document export would create the first pdf and copy back to the document directory. A copier script does that.

Today I realized that I can hack a new copier script that completes that first part, returns the pdf output to my document directory,  but then also does the other bits of work as a part of the same motion.  The script looks at the output target directory, which is my original document directory.  It follows the pdf output file back there, it scans the files, and then processes them to create a new article format document.  I can't say this is the most perfect idea ever, but it is the first way I've found that appears to work in a practical way.
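The "follow the pdf back to the document directory" step is just path arithmetic on the copier's destination argument. A hedged sketch follows; the function name `article_name_for` and the example path are invented for this illustration, and it assumes the LyX file sits next to the exported pdf with the same base name.

```shell
#!/bin/bash
# Sketch: from the pdf destination path, work out where the
# temporary article-format LyX file would be written.
article_name_for() {
    local tofile="$1"
    local filebase
    filebase=$(basename "$tofile")      # first-R-03.pdf
    local noext="${filebase%.*}"        # first-R-03
    # Same directory as the pdf, with -article added to the base name
    printf '%s/%s-article.lyx\n' "$(dirname "$tofile")" "$noext"
}

article_name_for /home/user/lectures/first-R-03.pdf
```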

How? Fiddle together some scripts and integrate them into the LyX preferences. I've never had much luck using the LyX graphical preference editor, but I do succeed when editing the preferences file in the LyX configuration directory.

About the Sweaving: I've been doing it this way for a decade, since before LyX integrated its own approach to Sweave. I prefer my approach for various reasons. My "pjconverter-small.sh" is doing essentially the same work as LyX's built-in sweave->pdf translation. The built-in thing in LyX does not have the automatic multi-generation of outputs that I'm aiming for here. And it leaves some trash files like Rplots.pdf in my document directory. But otherwise, it works in exactly the same way.

I'm afraid I'll forget the details, so here they are. All the scripts I write have pj at the front because I need to keep them separated from the scripts from other people that are piling up. And I'm egotistical.

If, in the future, I have to cave in and let the built in LyX converter do its work, the copier can still be used with that so as to generate the R and article output after the main pdf is created. Yay. I'm so happy.

In ~/.lyx/preferences, put some new formats, converters, and a copier.

<pre>

# FORMATS SECTION ##########################
\format "fen" "fen" "FEN" "" "auto" "auto" ""
\format "agr" "agr" "Grace" "" "auto" "auto" "vector"
\format "pdflatex6" "tex" "LaTeX (pdflatex 6)" "" "" "%%" "document,menu=export"
\format "pdflatex7" "tex" "LaTeX (pdflatex 7)" "" "" "%%" "document,menu=export"
\format "noteedit" "not" "Noteedit" "" "auto" "auto" "vector"
\format "pdf7" "pdf" "PDF (Beam + Article)" "" "evince" "" "document,vector,menu=export"
\format "pdf6" "pdf" "PDF (Beamer)" "" "evince" "" "document,vector,menu=export"
\format "pdf2" "pdf" "PDF (pdflatex)" "F" "evince" "" "document,vector,menu=export"
\format "r" "R" "R/S code" "" "" "" "document"
\format "tgif" "obj" "Tgif" "" "auto" "auto" "vector"

# CONVERTERS SECTION ##########################
#
\converter "sweave" "pdflatex7" "pjconverter.sh $$p$$i $$p $$r $$b" ""
\converter "sweave" "pdflatex6" "pjconverter-small.sh $$p$$i $$p $$r $$b" ""
\converter "pdflatex7" "pdf7" "pdflatex $$i" "latex=pdflatex"
\converter "pdflatex6" "pdf6" "pdflatex $$i" "latex=pdflatex"

#
# COPIERS SECTION ##########################
#
\copier pdf7 "pjBeamerMultiOutCopier.sh $$i $$o"

##############################.lyx/preferences################

</pre>
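If you want to check what LyX really hands to a converter for $$p$$i, $$p, $$r and $$b (the scripts' own comments call them the Rnw name, the working directory, the input directory, and the input file without extension), a throwaway logging function can temporarily stand in for the real script. This is a debugging sketch only; the name `log_args`, the log location, and the example paths are all made up.

```shell
#!/bin/bash
# Sketch: append each argument, numbered, to a log file so the values
# passed by LyX can be inspected afterwards.
log_args() {
    local logfile="$1"
    shift
    local n=1
    local arg
    for arg in "$@"; do
        printf '$%d = %s\n' "$n" "$arg" >> "$logfile"
        n=$((n + 1))
    done
}

# Demo with hypothetical values in place of what LyX would pass
demo_log=$(mktemp)
log_args "$demo_log" "/tmp/lyx_tmp/first-R-03.Rnw" "/tmp/lyx_tmp/" "/home/user/lectures/" "first-R-03"
cat "$demo_log"
```

In a stand-in converter script, the last lines would be replaced by a single call such as `log_args /tmp/pjargs.log "$@"`.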

And the required scripts:

 

pjconverter.sh

<pre>

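#!/bin/bash
## Arguments. $1 is Rnw name
## $2 working directory
## $3 input directory
## $4 input file without extension

## Clear out stale output from the previous run.
## The glob stays outside the quotes so the shell can expand it.
rm -f "$4.pdf"
rm -rf "$2/plots"
rm -f "$2"/*.pdf

## Weave: run the R chunks and write the tex file
R CMD Sweave "$1"

## Copy any plots* directories back to the input directory
for d in "$2"/plots*; do
    if [ -d "$d" ]; then
        rsync -rav "$d" "$3"
    fi
done

## Tangle: extract the R code into a single file
pjrtangle.sh "$1" "$3"
sleep 1
exit 0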

</pre>

pjconverter-small.sh

<pre>

#!/bin/bash

## ONLY creates the pdf file.
## Arguments. $1 is Rnw name
## $2 working directory
## $3 input directory
## $4 input file without extension

## Clear out stale output; the glob stays outside the quotes so it expands
rm -f "$4.pdf"
rm -rf "$2/plots"
rm -f "$2"/*.pdf
R CMD Sweave "$1"
sleep 1
exit 0

</pre>

pjBeamerMultiOutCopier.sh

<pre>

#!/bin/bash
FROMFILE=$1
TOFILE=$2

## First, copy the pdf output that was already made back to the document directory
echo "Now try cp $FROMFILE $TOFILE"
cp "$FROMFILE" "$TOFILE"

## Base name of the target file, with and without its extension
FILEBASE=$(basename "$TOFILE")
FILENOEXT=${FILEBASE%.*}

WORKDIR=$(dirname "$FROMFILE")

DOCDIR=$(dirname "$TOFILE")
cd "$DOCDIR" || exit 1

## Make a temporary copy of the LyX file and switch it to the article class
cp "$FILENOEXT.lyx" "$FILENOEXT-article.lyx"

perl -pi.bak -e 's/sweavel-beamer/sweavel-beamer-article/' "$FILENOEXT-article.lyx"
lyx -e pdf6 "$FILENOEXT-article.lyx"

## Move the temporary LyX files out of the document directory
mv "$FILENOEXT-article.lyx" /tmp
mv "$FILENOEXT-article.lyx.bak" /tmp

sleep 1
exit 0
</pre>

 

pjrtangle.sh

<pre>

#!/bin/bash

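## Arguments. $1 is Rnw name, $2 is the directory that receives the R file
## take an Rnw that has split=T, change to F
## (the pattern matches the literal text split=T only)
perl -pi.bak -e 's/split=T/split=F/' "$1"

## Tangle: write the R code chunks out as a single R file
R CMD Stangle "$1"

FILE=$(basename "$1")
FILENOEXT=${FILE%.*}

## Copy the tangled R file(s) to the destination directory
rsync -av "$FILENOEXT"*.R "$2"
exit 0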

</pre>

 

About pauljohn

Paul E. Johnson is a Professor of Political Science at the University of Kansas. He is an avid Linux User, an adequate system administrator and C programmer, and humility is one of his greatest strengths.