Tricks for working with JAGS

  • The Gamma distribution is frequently used as a prior for the precision in a model because of its conjugacy with the Normal. Given JAGS's parameterization of the Gamma distribution with a shape and a rate parameter, it is not easy to select sensible hyper-parameters directly. It is easier to think in terms of hyper-parameters that correspond to the mean and the variance of a prior for the precision. To use mean and variance hyper-parameters (mu and sigma) with JAGS's dgamma, use dgamma((mu^2)/sigma, mu/sigma) (see the sketch after this list).
  • The glm module provides a series of more efficient samplers that JAGS automatically selects whenever possible (e.g. the Albert and Chib (1993) sampler). This usually leads to much better mixing. I tend to load the module right after loading the rjags package itself, i.e. library(rjags); load.module("glm").
  • To check which samplers JAGS has selected, use the list.samplers() function on the JAGS model object. This is useful because sometimes a slight change in the coding helps JAGS select better samplers. For example, if you fit a logit model in JAGS and want the Holmes-Held sampler from the glm module to be used, specify the prior for the coefficients with dnorm() rather than dmnorm(), i.e. do not use blocking.
  • Avoid deterministic nodes as much as possible to increase speed. In particular, do all pre- and post-processing in R, including drawing from the predictive distribution, and compute deterministic functions inside probabilistic statements, e.g. y ~ dnorm(x*beta, 1) instead of mu <- x*beta; y ~ dnorm(mu, 1).
  • For comprehensible, readable code, let the running index be the lower-case letter corresponding to the capital letter of its upper limit, i.e. for(n in 1:N) instead of for(i in 1:N). This becomes especially useful when one has many nested hierarchies.
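
To make several of these tips concrete, here is a minimal sketch of a JAGS model that re-parameterizes the Gamma prior for the precision in terms of a prior mean and variance, inlines the linear predictor inside the probabilistic statement, and uses the capital/lower-case indexing convention, followed by the rjags calls to load the glm module and inspect the selected samplers. The data names (y, x, N) and the hyper-parameter values are made up purely for illustration.

library(rjags)
load.module("glm")          # make the more efficient glm samplers available

model_string <- "
model{
  # Gamma prior for the precision, parameterized via a prior mean (mu.tau)
  # and prior variance (sigma.tau): shape = mu^2/sigma, rate = mu/sigma
  tau ~ dgamma(pow(mu.tau, 2) / sigma.tau, mu.tau / sigma.tau)
  beta ~ dnorm(0, 0.001)

  # deterministic function inlined in the probabilistic statement;
  # the running index n matches its upper limit N
  for(n in 1:N){
    y[n] ~ dnorm(x[n] * beta, tau)
  }
}"

# toy data, purely for illustration
dat <- list(y = rnorm(50), x = rnorm(50), N = 50,
            mu.tau = 1, sigma.tau = 10)

m <- jags.model(textConnection(model_string), data = dat, n.chains = 2)
list.samplers(m)            # check which samplers JAGS picked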

Get a slice of an mcmc.list faster

Say you sampled a large number of parameters from a posterior and stored the draws in an mcmc.list object. To analyze your results, you want to slice the mcmc.list and select only a subset of the parameters. One way to do this is:

mcmcsample[,i,]

where mcmcsample is your mcmc.list with the posterior sample and i indexes the parameter(s) you are interested in. It turns out that this indexing method is quite slow. A faster way is to use this little function:

getMCMCSample <- function(mcmclist, i){
  # extract columns i from a single chain, keeping the mcmc attributes
  chainext <- function(x, i) return(x[, i])
  # apply to every chain and re-assemble the result as an mcmc.list
  return(as.mcmc.list(lapply(mcmclist, chainext, i = i)))
}

Example:

# Run a toy model
library(MCMCpack)
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
posterior1 <- MCMCpoisson(counts ~ outcome + treatment)
posterior2 <- MCMCpoisson(counts ~ outcome + treatment)
posterior3 <- MCMCpoisson(counts ~ outcome + treatment)
mcmclist <- mcmc.list(posterior1,posterior2,posterior3)

system.time(mcmclist[,2:3,])
system.time(getMCMCSample(mcmclist, 2:3))

The built-in way takes 0.003 seconds on my machine, while getMCMCSample gets it done in 0.001 seconds. For this little example, the difference is negligible, but for large posterior samples it really makes a difference.
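
To see the gap on something larger, here is a rough sketch using a synthetic mcmc.list; the chain length, number of parameters, and parameter names are made up for illustration.

library(coda)

# build three fake chains of 20,000 draws on 100 parameters
fakechain <- function() {
  mcmc(matrix(rnorm(20000 * 100), nrow = 20000,
              dimnames = list(NULL, paste0("beta", 1:100))))
}
bigmcmc <- mcmc.list(fakechain(), fakechain(), fakechain())

system.time(bigmcmc[, 2:3, ])              # built-in indexing
system.time(getMCMCSample(bigmcmc, 2:3))   # the helper from above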