Node-Level Performance Engineering


This course teaches performance engineering approaches at the compute node level. “Performance engineering” as we define it is more than employing tools to identify hotspots and bottlenecks. It is about developing a thorough understanding of the interactions between software and hardware. This process must start at the core, socket, and node level, where the code that does the actual computational work is executed. Once the architectural requirements of a code are understood and correlated with performance measurements, the potential benefit of optimizations can often be predicted. We introduce a “holistic” node-level performance engineering strategy and apply it to different algorithms from computational science. Architectural details that are relevant for performance, such as pipelining, SIMD, superscalarity, and memory hierarchies, are covered in appropriate detail.
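As a rough illustration of how a performance limit can be predicted from architectural requirements, the following minimal sketch applies the basic Roofline idea that appears in the agenda: the attainable performance is taken as the smaller of the in-core peak and the product of memory bandwidth and computational intensity. The function names and all numbers in the usage note are hypothetical and chosen only for illustration; they are not taken from the course material.

```c
#include <assert.h>

/* Basic Roofline limit (illustrative sketch, names are hypothetical):
   attainable performance is the minimum of the in-core peak and the
   memory-bound limit, i.e. computational intensity times bandwidth. */
double roofline_gflops(double peak_gflops, double bandwidth_gbytes,
                       double intensity_flop_per_byte)
{
    assert(peak_gflops > 0.0 && bandwidth_gbytes > 0.0);
    assert(intensity_flop_per_byte > 0.0);

    double memory_bound = intensity_flop_per_byte * bandwidth_gbytes;
    return memory_bound < peak_gflops ? memory_bound : peak_gflops;
}

/* Fraction of the predicted limit that a measured run reaches. */
double roofline_efficiency(double measured_gflops, double limit_gflops)
{
    assert(limit_gflops > 0.0);
    return measured_gflops / limit_gflops;
}
```

For example, with an assumed peak of 100 GFlop/s, an assumed memory bandwidth of 40 GB/s, and a loop that performs 1 flop per 8 bytes of memory traffic (intensity 0.125 flop/byte), `roofline_gflops(100.0, 40.0, 0.125)` yields a memory-bound limit of 5 GFlop/s. A measured 2.5 GFlop/s would then correspond to half of the predicted limit, which suggests there may still be room for optimization.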

Participants must have basic knowledge of programming in Fortran or C and basic knowledge of OpenMP.


Registration is now open! Please register via the Indico conference management system.


Bachelor / Master students from Hessen & Rheinland-Pfalz: 40 EUR
PhD students from Hessen & Rheinland-Pfalz: 60 EUR
Members of German universities and public research institutes: 60 EUR
Others: 580 EUR

The fee includes coffee breaks, but not lunch or dinner.

Social Events

On the first evening, we plan a dinner (self-paying) at “Zum Storch am Dom”.

Cluster Computing Course

The Cluster Computing Course is for participants of Node-Level Performance Engineering.

Friday, August 30, 2019


Goethe Universität, Campus Riedberg (Frankfurt am Main)


Facts about the GOETHE-HLR & FUCHS clusters:

  • Hardware resources

  • File system

  • Environment modules

  • Partitions on the cluster

  • Architecture of the partitions

Batch Usage:

SLURM is the job scheduler installed on the GOETHE-HLR & FUCHS clusters. The session teaches attendees

  • how to prepare a submission script,

  • how to submit, monitor, and manage jobs on the clusters,

  • the theory behind resource and CPU management.

Travel Information and Accommodation

See our directions, the campus map, and the entrance and room 114 of building N100.

Public transportation

From the main railway station “Hauptbahnhof”, take S-Bahn S1 - S9 to “Hauptwache”, then U-Bahn U8 (direction Riedberg) to “Uni Campus Riedberg”.

Hotel Recommendation


Anja Gerbes, +49 (0)69 798-47356

This course is organized by HKHLR and CSC, Goethe University Frankfurt in cooperation with RRZE &



Day 1:

-  9:30 Welcome - Intro

-  9:45 Computer architecture for software developers

-  10:45  Coffee Break 15m

-  11:30  Performance Engineering Basics

-  12:00-13:00 Lunch Break

-  13:00  Tools for Performance Engineering 1

-  13:30  Exercise 1: The Bandwidth Benchmark

-  15:15  Coffee Break 15m

-  15:30  Roofline Model: Basics

-  17:00 End of day


Day 2:

-  9:00  Tools for Performance Engineering 2

-  9:45  Optimal use of parallel resources: SIMD, ccNUMA

-  10:30 Coffee Break 15m

-  11:15  Exercise 2: Dense Matrix Vector Multiplication

-  12:00-13:00 Lunch Break

-  13:00  Performance Engineering Basic Skills

-  14:00  Case Study: Roofline Model for a Jacobi smoother

-  15:15 Coffee Break 15m

-  15:30 Exercise 3: Analysis of miniMD proxy app

-  17:00 End of day