R Bioconductor Install (or, where’s my “graph” package?)

Users reported that an R package I use, gRbase, "doesn't work". I had to watch them try it and read the error messages.

Here is what happens when you try

> install.packages("gRbase", dep = TRUE)

The package "gRbase" depends on other packages, like "graph", and the messages tell you that those dependencies are unavailable, so the install fails. Then you try to install "graph" by itself, and that fails too: R says that there is no "graph" for R version 2.15.3.
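A quick way to see this coming is to ask R which of the packages you want are missing from the repositories it currently knows about. This is only a sketch: missing_from_repo is a name I made up for illustration, and by default it asks available.packages() about whatever repositories are set in your session, which needs a network connection.

```r
# Hypothetical helper (not part of R): which of these packages are
# absent from a repository index? 'avail' is a matrix like the one
# returned by available.packages(), with package names as row names.
missing_from_repo <- function(pkgs, avail = available.packages()) {
  setdiff(pkgs, rownames(avail))
}

# A tiny made-up index, so the example runs without a network connection.
fake_index <- matrix(c("1.0", "2.0"), ncol = 1,
                     dimnames = list(c("gRbase", "igraph"), "Version"))

print(missing_from_repo(c("gRbase", "graph"), avail = fake_index))
```

With a real session you would call missing_from_repo(c("gRbase", "graph")) and let it query your configured repositories.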

Following up, I realized that "graph" is now on Bioconductor, along with some other really good packages. Bioconductor is a supplementary package repository. One of its leaders is Robert Gentleman, who was a founder of R.

How do you install from it? Read the instructions at

http://bioconductor.org/install

The simplest way is to rely on the script they wrote to configure everything automatically.

> source("http://bioconductor.org/biocLite.R")
> biocLite("graph")

Here's what I see:

> biocLite("graph")

BioC_mirror: http://bioconductor.org
Using Bioconductor version 2.11 (BiocInstaller 1.8.3), R version 2.15.
Installing package(s) 'graph'
trying URL 'http://bioconductor.org/packages/2.11/bioc/src/contrib/graph_1.36.2.tar.gz'
Content type 'application/x-gzip' length 941743 bytes (919 Kb)
opened URL
==================================================
downloaded 919 Kb

* installing *source* package ‘graph’ ...
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG      -fpic  -O2 -pipe -g  -c graph.c -o graph.o
In file included from graph.c:4:0:
/usr/share/R/include/R_ext/RConverters.h:32:2: warning: #warning "R_ext/RConverters.h was deprecated in R 2.15.1 and will be removed in R 3.0.0" [-Wcpp]
gcc -std=gnu99 -shared -o graph.so graph.o -L/usr/lib/R/lib -lR
mv graph.so BioC_graph.so
installing to /home/pauljohn/R/x86_64-pc-linux-gnu-library/2.15/graph/libs
** R
** data
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
   ‘GraphClass.Rnw’
   ‘MultiGraphClass.Rnw’
   ‘clusterGraph.Rnw’
   ‘graph.Rnw’
   ‘graphAttributes.Rnw’
** testing if installed package can be loaded

* DONE (graph)

The downloaded source packages are in
        ‘/tmp/RtmpqukmNQ/downloaded_packages’
Old packages: 'Amelia', 'ape', 'ash', 'BayesX', 'bdsmatrix', 'Boruta', 'caret',
  'clue', 'corpcor', 'dse', 'Ecdat', 'edgeR', 'eha', 'ENmisc', 'entropy',
  'FactoMineR', 'FAiR', 'ff', 'FitAR', 'flowCore', 'flsa', 'functional',
  'gdata', 'gtools', 'hexbin', 'isa2', 'klaR', 'kohonen', 'ks', 'laeken',
  'limma', 'lpSolve', 'mboost', 'mirt', 'miscTools', 'multcomp', 'mvabund',
  'np', 'pan', 'partykit', 'pcaPP', 'pequod', 'pracma', 'pspline', 'randomLCA',
  'raster', 'Rcpp', 'RcppArmadillo', 'RCurl', 'relations', 'rgeos', 'RJSONIO',
  'RWeka', 'RWekajars', 'sandwich', 'semTools', 'seriation', 'sets',
  'sgeostat', 'shiny', 'sp', 'spdep', 'statmod', 'svMisc', 'testthat',
  'tweedie', 'vegan'
Update all/some/none? [a/s/n]:
Update all/some/none? [a/s/n]: n
Warning message:
installed directory not writable, cannot update packages 'actuar', 'akima',
  'ape', 'arm', 'bdsmatrix', 'Biobase', 'BiocGenerics', 'car', 'colorspace',
  'corpcor', 'degreenet', 'devtools', 'dichromat', 'digest', 'distr',
  'distrEx', 'effects', 'FAiR', 'flexmix', 'flowCore', 'flowViz', 'getopt',
  'ggplot2', 'gpclib', 'graph', 'gRbase', 'hexbin', 'HH', 'igraph', 'igraph0',
  'iplots', 'kernlab', 'ks', 'latentnet', 'lavaan', 'lawstat', 'locfit',
  'maptools', 'memisc', 'mice', 'network', 'optparse', 'party', 'pcaPP',
  'prodlim', 'psych', 'qgraph', 'quantreg', 'RandomFields', 'RBGL', 'Rcmdr',
  'RCurl', 'relevent', 'rgenoud', 'rgl', 'Rgraphviz', 'rJava', 'rms', 'rpanel',
  'segmented', 'sem', 'shapes', 'slam', 'sna', 'snow', 'sp', 'SparseM',
  'spatstat', 'splancs', 'startupmsg', 'survey', 'SweaveListingUtils',
  'TeachingDemos', 'testthat', 'trust', 'XML', 'Zelig', 'cluster', 'foreign',
  'KernSmooth', 'lattice', 'Matrix', 'nnet', 'rpart'
>

I said "n". Maybe you should update.

That script tries to do a lot of things, including a global update of all R packages, which you may or may not want. It asks before it does that; answer "a" and it will update everything. I would NOT do that on a "production" computer, because I can't trust somebody else's script to fiddle around with my installation. I'd want to understand all of the details. On your laptop, it is probably OK to say yes.

I asked around about Bioconductor installs and tried to find out why they do it that way. It is simply for user convenience; there is no evil purpose.

If you want to understand how package repositories can be used, and libraries of R packages can be managed, then you don't have to rely on that script. On our server systems, I don't use that script.

To make this work, you need two settings: the library where packages are installed and the repository they come from. Start with the library. You need to decide which directory on your computer will hold the installed packages. That directory is called the "library". To see the directories where you currently have libraries, run this command. I'll show my output:

> .libPaths()
[1] "/home/pauljohn/R/x86_64-pc-linux-gnu-library/2.15"
[2] "/usr/local/lib/R/site-library"
[3] "/usr/lib/R/site-library"
[4] "/usr/lib/R/library"

If I want to throw Bioconductor packages into one of those libraries, all is well.
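The "installed directory not writable" warning in the output above is a hint that not every library in that list can be written by an ordinary user. Here is a small sketch that checks each one. It uses only base R; the variable names are mine.

```r
# List every library directory and whether the current user can write to it.
libs <- .libPaths()

# file.access() returns 0 on success and -1 on failure; mode = 2 tests write permission.
writable <- file.access(libs, mode = 2) == 0

print(data.frame(library = libs, writable = unname(writable)))
```

On a typical Linux setup I would expect the library in your home directory to be writable and the system ones under /usr not to be, unless you run R as root.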

Read the help page on R's install.packages() function. Note that it has arguments named lib and repos.
With them, you explicitly tell R where to put packages and where to get them.

> install.packages("graph", lib="/some/path/you/want", repos="http://some.valid.package.server/whatever")

If you use a "/some/path/you/want" that is not in your .libPaths() output, then you need to think about how to load packages installed in a place that R does not search. That can be done by changing settings in the R startup environment (for example, the R_LIBS environment variable), or by calling .libPaths() with the new directory added. Either way, we can work it out.
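As a sketch of the second approach, this adds an extra directory to the front of the library search path for the current session only. The directory here is a throwaway under tempdir(), chosen just so the example is safe to run; on a real system you would substitute your own path.

```r
# A stand-in for "/some/path/you/want"; the directory must exist,
# because .libPaths() silently drops paths that do not.
extra_lib <- file.path(tempdir(), "extra-lib")
dir.create(extra_lib, showWarnings = FALSE, recursive = TRUE)

# Put the new library first, keeping the ones that were already there.
.libPaths(c(extra_lib, .libPaths()))

print(.libPaths())
```

After this, library() and install.packages() look in the new directory first. The change is forgotten when R exits, which is why a startup setting is the better choice for anything permanent.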

But on your own computer, don't bother setting lib. Just leave out that argument, and R will try to install into item [1] of your .libPaths().

On our cluster, I have Bioconductor stuff installed to a separate directory, but here I might be lazy and just use .libPaths()[1], in my home directory.

On the cluster, here's the R script I run to get their packages, which drops
the results into "/tools/lib64/R/bioconductor". As you can see, when I ran it,
the Bioconductor version in the BIOCREPO variable was 2.10. I think the
correct Bioconductor version is now 2.11. Note that Bioconductor versions DO NOT
mirror R versions. I'm in R 2.15, but the Bioc version is 2.10 below, and I should
raise that to 2.11.

BIOCDIR <- "/tools/lib64/R/bioconductor"
BIOCREPO <- "http://www.bioconductor.org/packages/2.10/bioc"

# The installer package itself, then one package by name.
install.packages("BiocInstaller", repos = BIOCREPO, lib = BIOCDIR)

install.packages("limma", repos = BIOCREPO, lib = BIOCDIR)

# Several packages at once; dep = TRUE asks for their dependencies too.
desiredPackages <- c("graph", "ddgraph", "cellGrowth", "DirichletMultinomial")

install.packages(desiredPackages, repos = BIOCREPO, lib = BIOCDIR, dep = TRUE)

install.packages("RBGL", repos = BIOCREPO, lib = BIOCDIR, dep = TRUE)

install.packages(c("biocGraph", "DEGraph", "Rgraphviz"), repos = BIOCREPO, lib = BIOCDIR, dep = TRUE)
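Since the only thing that changes between Bioconductor releases is the version number in the URL, one option is to build the repository address from the version. This is a sketch, not something the Bioconductor project provides: bioc_repo is a made-up helper, and it assumes the URL layout seen in the script above.

```r
# Hypothetical helper: build the repository URL for a Bioconductor release.
# The version must be a string. A number like 2.10 would print as "2.1".
bioc_repo <- function(version, section = "bioc") {
  stopifnot(is.character(version), length(version) == 1)
  sprintf("http://www.bioconductor.org/packages/%s/%s", version, section)
}

BIOCREPO <- bioc_repo("2.11")
print(BIOCREPO)
```

Then raising the version is a one-word change at the top of the script.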

About pauljohn

Paul E. Johnson is a Professor of Political Science at the University of Kansas. He is an avid Linux User, an adequate system administrator and C programmer, and humility is one of his greatest strengths.