HiPerCH 12 - Module 3

Introduction to CUDA Programming

Content

This course introduces students to the principles and practice of parallel programming for NVIDIA GPUs using CUDA. Through a combination of lectures and practicals, students learn about NVIDIA GPU architectures, CUDA C++, fundamental CUDA algorithms and optimization techniques.

Agenda

Day 1 (Wednesday, September 23)

09:00 Welcome
Overview of the course
Connecting to the Host Cluster required for the hands-on sessions
09:30 Lecture 1: Introduction to GPU Architecture and CUDA C++
GPU architecture and fundamental differences from CPUs
Basic concepts, syntax and API needed for heterogeneous programming with CUDA C++
How to write GPU kernels and manage GPU thread groups
CUDA API and error handling
10:45 Break
11:00 Hands-on
Participants can complete example exercises meant to reinforce the presented concepts
13:00 Lunch Break
14:00 Lecture 2: Fundamental CUDA Optimization
Optimization strategies related to kernel launch configurations & hiding latency
GPU Memory Hierarchy
Global memory throughput and use of shared memory
15:15 Break
15:30 Hands-on
17:30 Closing Day One

Day 2 (Thursday, September 24)

09:00 Lecture 3: CUDA Programming & Parallel Patterns
Thread execution, memory access and atomic operations
Fundamental parallel algorithms: reduction, scan and histogram
GPU managed memory: API and performance optimization
10:15 Break
10:30 Hands-on
12:30 Lunch Break
13:30 Lecture 4: CUDA Concurrency
CUDA concurrency: Motivation and possible scenarios
Pinned memory
CUDA Streams concept and API
Overlapping computation and data transfer
14:45 Break
15:00 Hands-on
17:00 Final Remarks

Trainer(s)

Fouzhan Hosseini (NAG)
Jacob Senior (NAG)
Eleni Vlachopoulou (NAG)

Participation

The course is aimed at reasonably competent programmers. Familiarity with basic C is assumed; no other prior knowledge is required.
Sessions will be held online via Zoom. Contact data will be made available after successful registration.
The hands-on sessions will be carried out over an SSH connection to the Lichtenberg cluster at Technical University Darmstadt. Participants should use their own computer running Linux, macOS, or Windows with MobaXterm installed. Participants without a Lichtenberg account will be provided with a guest account for the course.
Participants will be provided access to Lichtenberg prior to the course to be able to familiarize themselves with the environment and the material.
NAG will provide PDFs of the lecture material. Example code and solutions, as source files, will be supplied as appropriate. Printed course materials are not offered.
Practical example classes will be conducted via Slack. Attendees will have the opportunity to post questions, compiler error messages, code snippets, etc. for discussion.
The maximum number of participants is 30.

HKHLR - HPC Hessen
