Introduction to CUDA Programming
Content
This course introduces students to the principles and practice of parallel programming for NVIDIA GPUs using CUDA. Through a combination of lectures and practicals, students learn about NVIDIA GPU architectures, CUDA C++, fundamental CUDA algorithms, and optimization techniques.
Agenda
Day 1 (Wednesday, September 23)
- 09:00 Welcome
Overview of the course
Connecting to the Host Cluster required for the hands-on sessions
- 09:30 Lecture 1: Introduction to GPU Architecture and CUDA C++
GPU Architecture and fundamental differences with CPU
Basic concepts, syntax and API needed for heterogeneous programming with CUDA C++
How to write GPU kernels and manage GPU thread groups
CUDA API and error handling
- 10:45 Break
- 11:00 Hands-on
Participants can complete example exercises meant to reinforce the presented concepts
- 13:00 Lunch Break
- 14:00 Lecture 2: Fundamental CUDA Optimization
Optimization strategies related to kernel launch configurations and latency hiding
GPU Memory Hierarchy
Global memory throughput and use of shared memory
- 15:15 Break
- 15:30 Hands-on
- 17:30 Closing Day One
Day 2 (Thursday, September 24)
- 09:00 Lecture 3: CUDA Programming & Parallel Patterns
Thread execution, memory access and atomic operations
Fundamental parallel algorithms: reduction, scan and histogram
GPU managed memory: API and performance optimization
- 10:15 Break
- 10:30 Hands-on
- 12:30 Lunch Break
- 13:30 Lecture 4: CUDA Concurrency
CUDA concurrency: Motivation and possible scenarios
Pinned memory
CUDA Streams concept and API
Overlapping computation and data transfer
- 14:45 Break
- 15:00 Hands-on
- 17:00 Final Remarks
Trainer(s)
- Fouzhan Hosseini (NAG)
- Jacob Senior (NAG)
- Eleni Vlachopoulou (NAG)
Participation
- The course is aimed at reasonably competent programmers. Familiarity with basic C is assumed; no other prior knowledge is required.
- Sessions will be held online via Zoom. Contact data will be made available after successful registration.
- The hands-on sessions will be carried out over an SSH connection to the Lichtenberg cluster at Technical University Darmstadt. Participants should use their own computer, running either Linux/macOS or Windows with MobaXterm installed. Participants without a Lichtenberg account will be provided with a guest account for the course.
- Participants will be given access to Lichtenberg before the course so that they can familiarize themselves with the environment and the material.
- NAG will provide PDFs of the lecture material. Example code and solutions, as source files, will be supplied as appropriate. Printed course materials are not offered.
- Practical example classes will be conducted via Slack. Attendees will have the opportunity to post questions, compiler error messages, code snippets, etc. for discussion.
- The maximum number of participants is 30.