R on HPC Systems
Content
This course provides an introduction to running R code successfully on HPC clusters. It focuses on language properties that affect performance and gives an overview of different approaches to parallelizing R code.
In the first part of the course, we establish good practices for avoiding some very common performance bottlenecks and provide hints on how to run R code under a batch scheduling system.
The second part gives an overview of several commonly used R parallelization packages and their applicability to different kinds of problems, and provides minimal working examples for later reference.
Participation
- A basic understanding of programming concepts (variables, loops, if-then-else constructs) is necessary.
- Basic knowledge of working with the Linux command line is recommended.
- Basic knowledge of R programming syntax is strongly recommended. This course focuses on using specific R language features and does not provide a general introduction to R programming.
- Sessions will be held online via Zoom. Access details will be made available after successful registration.
- The hands-on sections will be carried out on an HPC cluster that runs the SLURM batch scheduler. Parts of the exercises can also be run locally, but following the exercises on job scheduling and the Rmpi package requires access to an HPC cluster with SLURM and a working MPI library.
Note: The workshop is limited to 30 participants.
Registration
Registration will be available soon.