POLS 707 PJ

Fun with R!

The preferred method of using R is via a text editor called "Emacs" and the ESS (Emacs Speaks Statistics) mode of Emacs.

1. Log in. Open a terminal. First, make sure R is installed by running it from the command line. At the prompt (#) type R and hit return.

   # R

If R starts, you will know because a special prompt (>) appears. Make sure it works by running these commands in R:

   > x <- rnorm(100)
   > hist(x)

That should show a histogram of x. If it works, then everything is set up properly and you can quit the R session by typing

   > q()

and then answering "n" when it asks if you want to save your session.

Next, find your terminal or open a new one. This time, start the Emacs program with this command at the shell prompt:

   # emacs Rtest.R

After that, check out the very bottom part of the Emacs window. It is a status and command bar. Press Alt-x, and in the bottom line it will prompt you, saying

   M-x

(In Emacs, M stands for Alt. C stands for Control.) When it prompts you, type a capital R and hit return. Before you hit return, it will look like this:

   M-x R

After you do that, it will say

   ESS [S(R): R] starting data directory? ~/

~ is your home directory. You can change that if you want, but leave it for now.

After that, R will start inside an Emacs window. Using the newest Emacs and ESS, as configured in our Lab or on systems that use the packages I built, R will appear in its own "buffer" (roughly, the Emacs word for a window).

You can type commands directly into the R window if you want, just as you did earlier. However, I want you to try a more sophisticated approach. Type your commands into the new file you are creating, Rtest.R, and then, when they are ready, send them over to R. This is the ESS magic at work.

In the Emacs window Rtest.R, type this:

   x <- rnorm(100)
   hist(x)

Then highlight those lines with the mouse and send the commands to R. You can send this small region of commands to R in 2 ways.

   1. There's an icon with an arrow -> and three lines under it.
      If you hover the mouse over it, it will say "Eval Region". Click that, and it will send the commands over.

   2. After highlighting the region, click on the ESS item in the top menu bar, pull down to "ESS and go" and choose "Eval region". The only difference here is that, since you chose the "go" option, the focus will change to the R window.

Of course, you can also send over the whole buffer (er, file, window) if you want. Look at the ESS menu items; it's easy.

The beauty of this is that you can save the file Rtest.R and you have an exact record of what you have done.

Please remember I've documented the answers to many, many specific questions already in

   http://www.ku.edu/~pauljohn/R/Rtips.html

2. Try these commands. You can type into the R terminal if you want, or into your file Rtest.R.

   > help.start()

Make a note of what it does. It should launch a web browser that shows the index of R documentation. Look around in there for a half hour or so.

Please note that some packages are always loaded in R. The "base" package is always available. Other packages are not automatically loaded, and you can only use them if you type the correct library(...) command. To see what packages exist on your computer, type this:

   > library()

That output will show in your R buffer.

To quit R, you have to run the quit function, which is

   > q()

There are several introductions to R available. On reserve I have the Venables and Ripley book Modern Applied Statistics, as well as Myatt's tutorial and Maindonald's book. If they are gone, tell me.

4. Let's try to read in some numbers with the read.table function.

Still in Emacs, from the File pull-down menu, click "Open File". In the bottom command line it will ask for a file name, and I want you to create a new file called "testdata.txt". In the editor, create a silly data set with the variable names in the first row, like:

   x1 x2 x3 x4 x5
   22 1 22 11 11
   22 33 11 323 11
   22 1 33 55 76
   55 33 33 44 33

Don't forget to "hit return" at the end of the last one.
You'll get an error if you don't. I saved my data as "testdata.txt". Save yours under the same name.

Then, in R, try to read that in.

   > read.table("testdata.txt",header=TRUE)

R looks in its "start in" directory, which is the same place where you saved testdata.txt, unless something went wrong. If you saved the data somewhere else, then you will have to give the path to the file, as in:

   > read.table("/home/paul/MyRStuff/testdata.txt",header=TRUE)

Please, please note: it wants forward slashes, not backslashes. (Inside an R string, a single backslash is an escape character.)

The data should show. BUT IT IS NOT SAVED. To save it as a "data frame" you have to do this:

   > paultest <- read.table("testdata.txt",header=TRUE)

(that's all on one line)

Now look over your new data frame:

   > paultest
   > paultest$x1
   > paultest$x2

Note you can access only the first row this way too:

   > paultest[1,]

And if you just put in one number there, it assumes you want a column.

   > paultest[1]

Now try:

   > attributes(paultest)
   > objects()
   > help.search("data.frame")

Suppose you want to look only at cases for which the first variable, x1, equals 55.

   > paultest[paultest$x1==55,]

Note the double equal sign for "logical equal to". That's standard in many computer languages.

You can make it simpler by attaching the paultest dataset:

   > attach(paultest)
   > paultest[x1==55,]

You can create new variables by doing any mathematical calculations available in R. There are tons of them.

   > new1 <- log(paultest$x1)

Now "new1" is a variable that exists in your workspace. It is not part of the data frame paultest, however. You have to say so explicitly if you want that to happen:

   > paultest$new2 <- log(paultest$x1)

5. Let's make a histogram:

   > hist(paultest$x4)

You can read more on hist options if you ask for help:

   > ?hist

If you want to run the example histograms from the help page, do

   > par(ask=TRUE)

(That forces it to stop and ask if you want to see the next plot.)

   > example(hist)

If you just want a textual printout of the frequency distribution of counts, try

   > table(paultest$x4)

6. We want to make a cross tabulation!

   > help.search("cross tabulation")
   > example(table)
   > objects()

Wow. Your workspace is full of objects from the table example!

Create a table of counts:

   > table(x1,x2)

It will look ugly, like this:

       x2
   x1   1 33
     22 2  1
     55 0  1

Apparently, they expect us to calculate column percentages on our own. Try

   > mytable <- table(x1,x2)

To get "total proportions", do this:

   > prop.table(mytable)

To get "column proportions", do this:

   > prop.table(mytable,2)

The 2 refers to the "second dimension" (the columns). If you are burning for percentages, do

   > 100*prop.table(mytable,2)

If you did

   > prop.table(mytable,1)

you would get row proportions.

Next, load the package called "gregmisc". (On current R installations, CrossTable is provided by the gmodels package, which grew out of gregmisc.) In the help pages for that package, you should find a function called CrossTable that will make more beautiful cross tabulation tables. Give it a try.

7. In case you wonder what a regression model is like in R, let me explain: R follows the tradition of the S computer language, which is extremely powerful but somewhat different from what I'm used to. A regression "formula" has the dependent variable on the left, then a tilde, and then the independent variables. As in

   > lm(x1 ~ x2 + x3 + x4, data=paultest)

Usually you don't want just to print to the screen, so save the fitted model:

   > regtest1 <- lm(x1 ~ x2 + x3 + x4, data=paultest)

That does not show anything, but you can find out about regtest1 by typing commands like:

   > summary(regtest1)
   > attributes(regtest1)
   > regtest1
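
The regression steps in item 7 can be sketched as a short script you could paste into Rtest.R. This is only an illustration under a few assumptions: the data frame is rebuilt inline from the toy numbers in item 4 so the script runs without testdata.txt, and the name regtest2 is made up for this example. One thing it is meant to show is that the toy data has 4 rows and the model in item 7 has 4 coefficients (intercept plus three slopes), so that fit is exact and leaves no residual degrees of freedom; a smaller model is fitted alongside for comparison.

```r
# Rebuild the toy data from item 4 inline (an assumption made so that
# this script runs even if testdata.txt is not in the working directory).
paultest <- data.frame(
  x1 = c(22, 22, 22, 55),
  x2 = c(1, 33, 1, 33),
  x3 = c(22, 11, 33, 33),
  x4 = c(11, 323, 55, 44),
  x5 = c(11, 11, 76, 33)
)

# The model from item 7: dependent variable, tilde, independent variables.
regtest1 <- lm(x1 ~ x2 + x3 + x4, data = paultest)

# Pull pieces out of the fitted model instead of just printing it.
print(coef(regtest1))         # estimated intercept and slopes
print(df.residual(regtest1))  # 4 rows minus 4 coefficients

# A smaller model (regtest2 is a hypothetical name) leaves residual
# degrees of freedom, so summary() can report standard errors.
regtest2 <- lm(x1 ~ x2, data = paultest)
print(coef(regtest2))
print(residuals(regtest2))
print(summary(regtest2))
```

With real data you would have many more rows than coefficients; the point here is only that coef(), residuals() and df.residual() let you work with the saved model object rather than reading numbers off the screen.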