
Parallel programming in R

Bjørn-Helge Mevik
Research Infrastructure Services Group, USIT, UiO

RIS Course Week spring 2013

Bjørn-Helge Mevik (RIS)

Parallel programming in R

Course Week spring 2013

1 / 13

Introduction

Simple example

Practical use

The end...


Introduction

Background

R is single-threaded

There are several packages for parallel computation in R, some of which have existed for a long time, e.g. Rmpi, nws, snow, sprint, foreach, multicore

As of version 2.14.0, R ships with the package parallel

R can also be compiled against multi-threaded linear algebra libraries (BLAS, LAPACK), which can speed up calculations

Today's focus is the parallel package.


Introduction

Overview of parallel

Introduced in 2.14.0

Based on packages multicore and snow (slightly modified)

Includes a parallel random number generator (RNG); important for simulations

Particularly suitable for single program, multiple data (SPMD) problems

Main interface is parallel versions of lapply and similar

Can use the CPUs/cores on a single machine (multicore), or several machines, using MPI (snow)

MPI support depends on the Rmpi package (installed on Abel)
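The parallel RNG mentioned above can be sketched as follows. This is a minimal example, assuming a PSOCK cluster; the worker count and seed are arbitrary illustrations. clusterSetRNGStream gives each worker its own stream of the L'Ecuyer-CMRG generator:

library(parallel)

## Give each worker its own, reproducible random number stream:
cl <- makeCluster(2, type = "PSOCK")
clusterSetRNGStream(cl, iseed = 42)
res <- parLapply(cl, 1:2, function(i) runif(3))
stopCluster(cl)

## Re-running with the same iseed gives identical draws,
## which is what makes parallel simulations reproducible.
print(res)

For mclapply, the equivalent preparation is RNGkind("L'Ecuyer-CMRG") followed by set.seed() before the call.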


Simple example

Simple example: serial

parallel provides substitutes for lapply, etc.

Silly example for illustration: calculate (1:100)^2

Serial version:
## The worker function to do the calculation:
workerFunc <- function(n) { return(n^2) }
## The values to apply the calculation to:
values <- 1:100
## Serial calculation:
res <- lapply(values, workerFunc)
print(unlist(res))

Simple example

Simple example: mclapply

Performs the calculations in parallel on the local machine

(+) Very easy to use; no set-up
(+) Low overhead
(-) Can only use the cores of one machine
(-) Uses fork, so it will not work on MS Windows

workerFunc <- function(n) { return(n^2) }


values <- 1:100
library(parallel)
## Number of workers (R processes) to use:
numWorkers <- 8
## Parallel calculation (mclapply):
res <- mclapply(values, workerFunc, mc.cores = numWorkers)
print(unlist(res))

Simple example

Simple example: parLapply

Performs the calculations in parallel, possibly on several nodes

Can use several types of communication, including PSOCK and MPI

PSOCK:
(+) Can be used interactively
(-) Not good for running on several nodes
(+) Portable; works everywhere
=> Good for testing

MPI:
(-) Needs the Rmpi package (installed on Abel)
(-) Cannot be used interactively
(+) Good for running on several nodes
(+) Works everywhere where Rmpi does
=> Good for production


Simple example

Simple example: parLapply (PSOCK)


workerFunc <- function(n) { return(n^2) }
values <- 1:100
library(parallel)
## Number of workers (R processes) to use:
numWorkers <- 8
## Set up the cluster
cl <- makeCluster(numWorkers, type = "PSOCK")
## Parallel calculation (parLapply):
res <- parLapply(cl, values, workerFunc)
## Shut down cluster
stopCluster(cl)
print(unlist(res))

Simple example

Simple example: parLapply (MPI)


simple_mpi.R:
workerFunc <- function(n) { return(n^2) }
values <- 1:100
library(parallel)
numWorkers <- 8
cl <- makeCluster(numWorkers, type = "MPI")
res <- parLapply(cl, values, workerFunc)
stopCluster(cl)
mpi.exit() # or mpi.quit(), which quits R as well
print(unlist(res))
Running:
mpirun -n 1 R --slave -f simple_mpi.R
Note: Use R >= 2.15.2 for MPI, due to a bug in earlier versions of parallel.

Practical use

Preparation for calculations

Write your calculations as a function that can be called with lapply

Test interactively, with lapply serially, and mclapply or parLapply (PSOCK) in parallel

Deploy with mclapply on a single node, or parLapply (MPI) on one or more nodes

For parLapply, the worker processes must be prepared with any needed packages, using clusterEvalQ or clusterCall.

For parLapply, large data sets can be exported to the workers with clusterExport.
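The last two points can be sketched like this; the data set and the worker function are made up for illustration:

library(parallel)

cl <- makeCluster(4, type = "PSOCK")

## Prepare the workers: load needed packages on each of them.
clusterEvalQ(cl, library(stats))

## Export a large object once, up front, instead of shipping it
## along with every task:
bigData <- as.numeric(1:1e6)
clusterExport(cl, "bigData")

## The workers can now refer to bigData by name:
res <- parLapply(cl, 1:4, function(i) mean(bigData) + i)
stopCluster(cl)
print(unlist(res))  # 500000.5 + 1:4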


Practical use

Extended example

(Notes to self:)

Submit jobs

Go through scripts

Look at results


Practical use

Efficiency

The time spent in each invocation of the worker function should not
be too short

If the time spent in each invocation of the worker function varies a lot, try the load-balancing versions of the functions

Avoid copying large things back and forth:
Export large data sets up front with clusterExport (for parLapply)
Let the values to iterate over be indices or similar small things
Write the worker function to return as little as possible

Reduce waiting time in the queue by not asking for whole nodes; if possible, use --ntasks instead of --ntasks-per-node + --nodes.
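The load-balancing versions mentioned above can be sketched like this; the random sleep is an artificial stand-in for uneven work:

library(parallel)

## A worker function whose run time varies a lot:
workerFunc <- function(i) { Sys.sleep(runif(1, 0, 0.05)); i^2 }
values <- 1:20
numWorkers <- 4

## mclapply: hand out tasks one by one instead of prescheduling:
res1 <- mclapply(values, workerFunc, mc.cores = numWorkers,
                 mc.preschedule = FALSE)

## parLapply has a load-balancing sibling, parLapplyLB:
cl <- makeCluster(numWorkers, type = "PSOCK")
res2 <- parLapplyLB(cl, values, workerFunc)
stopCluster(cl)

stopifnot(all(unlist(res1) == unlist(res2)))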


The end...

Other topics
There are several things we haven't touched on in this lecture:

Parallel random number generation

Alternatives to *apply (e.g. mcparallel + mccollect)

Lower level functions

Using multi-threaded libraries

Other packages and techniques
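As a small taste of the mcparallel + mccollect alternative (fork-based, so not on MS Windows; the two toy jobs are arbitrary):

library(parallel)

## Fork two computations as background jobs:
job1 <- mcparallel(sum(1:100))
job2 <- mcparallel(prod(1:10))

## Collect the results when both jobs have finished:
res <- mccollect(list(job1, job2))
print(unlist(res))  # 5050 and 3628800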

Resources:

The documentation for parallel: help(parallel)

The book Parallel R, by McCallum & Weston, O'Reilly

The HPC Task View on CRAN:
http://cran.r-project.org/web/views/HighPerformanceComputing.html

