R on HPC Systems

Content

This course introduces how to run R code successfully on HPC clusters. It focuses on language properties that affect performance and gives an overview of different approaches to parallelizing R code.

In the first part, we establish good practices for avoiding some very common performance bottlenecks and give hints on how to run R code under a batch scheduling system.

The second part gives an overview of several commonly used R parallelization packages, discusses their applicability depending on the problem at hand, and provides minimal working examples for later reference.
 


Participation

  • A basic understanding of programming concepts (variables, loops, if-then-else constructs) is necessary.
  • Basic knowledge of working with the Linux command line is recommended.
  • This course assumes that attendees are already familiar with R, i.e. the focus is not on how to write R programs in general, but on how to use specific features of the R language.
  • Sessions will be held online via Zoom. Contact data will be made available after successful registration.
  • The hands-on sections will be carried out on an HPC cluster that runs the SLURM batch scheduler. While parts of the exercises can also be run locally, access to an HPC cluster with SLURM and a working MPI library is necessary to follow the exercises on job scheduling and the Rmpi package.

Note: The workshop is limited to 50 participants. It is free of charge and will be held in English.


Click here to register.

Participating Universities