Running Parallel MCMC without specific R packages

Running multiple, distinct Markov chains that have a posterior density as their equilibrium density is useful to use convergence diagnostics such as the Gelman / Rubin diagnostic (Gelman & Rubin, 1992). In order to save time, one likes to run each Markov chain in parallel. There are a number of packages in R (parallel, snow,…) to run code in parallel. However, I find it more useful to write a script for a single chain and then dispatch it multiple times from the terminal. Here is an example with the MCMCpack: 

Save the following as “sim.R”:

substream <- as.numeric(commandArgs(trailingOnly = TRUE))

library(MCMCpack)
data(birthwt)

filename <- paste("posterior_", substream, ".Rdata" , sep="", collapse="")
out <- MCMCprobit(low~as.factor(race)+smoke, data=birthwt, 
 b0 = 0, B0 = 10, seed=list(NA, substream))
save(out,file=filename)

The first line grabs the seed argument from the terminal and pass it to MCMCprobit function which generates a single chain. The script can be executed from the terminal with four different substream seeds as follows:

Rscript --arch=x86_64 sim.R 1 & Rscript --arch=x86_64 sim.R 2 & Rscript --arch=x86_64 sim.R 3 & Rscript --arch=x86_64 sim.R 4

If you don’t have a 64-Bit machine, use “x86_32”. Provided that you have four cores, one core will be used for each of these jobs. Next, you can fuse the output together in an mcmc.list object and run the Gelman/Rubin diagnostic.

library(coda)
load("posterior_1.Rdata")
out1 <- out
load("posterior_2.Rdata")
out2 <- out
load("posterior_3.Rdata")
out3 <- out
load("posterior_4.Rdata")
out4 <- out

posteriors <- as.mcmc.list(list( as.mcmc(out1), as.mcmc(out2), 
 as.mcmc(out3), as.mcmc(out4)))
gelman.diag(posteriors)

The great benefit of this approach is that the R code is less crammed with parallelization functions and the process can be monitored since the output is dumped right into the terminal window.

References
Gelman, Andrew & Donald B. Rubin. 1992. “Inference from Iterative Simulation Using Multiple Sequences.” Statistical Science 7(4):457–472.

Advertisements

One Comment on “Running Parallel MCMC without specific R packages”

  1. Amit says:

    Hello,

    I want to parallelize the MCMC pack function MCMChregress which is optimized in C++ but uses a single core.

    What would be the right way to do that? From R or from C++? One option would be to use MPI and recompile function and its linkage with R.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s