Tricks for working with JAGS

  • The Gamma distribution is frequently used as a prior for the precision in a model because of its conjugacy with the Normal. Given JAGS parameterization of the Gamma distribution with a rate and shape parameter, it’s not easy to select sensible hyper-parameters. It’s easier to think of hyper-parameters that correspond to the mean and the variance of a prior for the precision. To use mean and variance hyper-parameters (mu and sigma) with JAGS’s dgamma, use: dgamma( ((mu^2)/sigma), (mu/sigma)).
  • The glm module provides a series of more efficient samplers that JAGS automatically selects whenever possible (e.g. Albert/Chib (1993)). This usually leads to much better mixing. I tend to load the module right away with the actual rjags package, e.i. library(rjags); load.module("glm").
  • To check which samplers are selected by JAGS, use the list.samplers() function on the JAGS model object. This is useful, because sometimes a slight change in the coding helps JAGS to select better samplers. For example, when using a logit model in JAGS and you want the Holmes-Held sampler from the glm module to work, do not use dmnorm() but dnorm() for the prior of the coefficient, e.i. do not use blocking.
  • Avoid deterministic nodes as much as possible to increase speed. In particular, do all pre- and post processing in R including drawing from the predictive distribution and compute deterministic functions inside probabilistic statements, e.g. y ~ dnorm( (x*beta), 1) instead of mu .
  • For comprehensible / readable code, let the running index be the small letter corresponding to the capital letter of the upper limit, e.i. for(n in 1:N) instead of for(i in 1:N). This becomes especially useful when one has many nested hierarchies.
Advertisements

Get a slice of a mcmclist faster

Say you sampled a large amount of parameters from a posterior and stored the parameters in a mcmclist object. In order to analyze your results, you want to slice the mcmclist and only select a subset of the parameters. One way to do this is:

mcmcsample[,i,]

where mcmcsample is your mcmclist with the posterior sample and i the parameter you are interested in. Turns out, this function is very slow. A faster way is to use this little function:

getMCMCSample <- function(mcmclist,i){
  chainext <- function(x,i) return(x[,i])
  return(as.mcmc.list(lapply(mcmclist, chainext, i = i)))
  }

Example:

# Run a toy model
library(MCMCpack)
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
posterior1 <- MCMCpoisson(counts ~ outcome + treatment)
posterior2 <- MCMCpoisson(counts ~ outcome + treatment)
posterior3 <- MCMCpoisson(counts ~ outcome + treatment)
mcmclist <- mcmc.list(posterior1,posterior2,posterior3)

system.time(mcmclist[,2:3,])
system.time(getMCMCSample(mcmclist, 2:3))

The build-in way takes 0.003 sec on my machine, while getMCMCSample gets it done in 0.001. For this little example, the difference is negligible. But as it turns out, for large posterior samples, it really makes a difference.


How to setup R and JAGS on Amazon EC2

Yesterday I setup a Amazon EC2 machine and installed R, JAGS and rjags. Unfortunately I was unable to compile the latest version of JAGS. The configuration step failed with an error that says that a specific LAPACK library is missing. I tried to install the missing library manually but for some reason JAGS doesn’t recognize the new version. After an hour, I decided to go with the older JAGS version. Here is how I installed all three packages.

[If you never worked on an EC2 machine: Here are two [1,2] helpful tutorials how start an EC2 instance and install the command line tool on your computer to access it via ssh.]

Login via ssh. In a first step we install an older version of R. It is necessary to use the older version in order to get JAGS 1.0.4 running later. After downloading the R source file, we unzip it, cd into the folder and start installing a couple of additional libraries and compilers that we need to compile R (and later JAGS). We then configure the R installer without any x-window support and install everything. The last line makes sure that you can start R with simply typing ‘R’.

cd ~
wget http://cran.r-project.org/src/base/R-2/R-2.10.0.tar.gz 
tar xf R-2.10.0.tar.gz
cd ~/R-2.10.0
sudo yum install gcc
sudo yum install gcc-c++
sudo yum install gcc-gfortran
sudo yum install readline-devel
sudo yum install make
./configure --with-x=no
make
PATH=$PATH:~/download/R-2.10.0/bin/

Next, try to start R and install the coda package.

R
install.packages('coda')
q()

Now, we proceed with installing JAGS. First, get some libraries and the JAGS source code. Unzip, cd into the folder and run configure with specifying the shared library folder. The specification of the shared library path is crucial, otherwise rjags fails to load in R.

sudo yum install blas-devel lapack-devel
cd ~/download/
wget http://sourceforge.net/projects/mcmc-jags/files/JAGS/1.0/JAGS-1.0.4.tar.gz/download
tar xf JAGS-1.0.4.tar.gz
cd ~/download/JAGS-1.0.4
./configure --libdir=/usr/local/lib64 
make 
sudo make install

In a last step, download the appropriate rjags version and install it it.

cd ~/download/
wget http://cran.r-project.org/src/contrib/Archive/rjags/rjags_1.0.3-13.tar.gz
R --with-jags-modules=/usr/local/lib/JAGS/modules/ CMD INSTALL rjags_1.0.3-13.tar.gz 

Sources of inspiration. Here, here, here and here.