M2: GPU Architecture & CUDA Execution Model

In this second module, we begin diving into the architecture of a GPU to better understand how it affects the performance of a GPU application.

Pre-recorded Lectures

Note: The pre-recorded videos for M2 will be posted after Wednesday’s lecture.

The pre-recorded lectures are available here: M2 Videos. You can also find the videos under the “Panopto” tab on the MPCS 52072 Canvas site.

The lectures are a series of videos, each approximately 20-30 minutes long, divided into the following sections:

  • 2.0 CUDA Execution Model (Cont.)

  • 2.1 Control Divergence Example

  • 2.2 Latency Hiding and Occupancy

  • 2.3 Deeper Dive into the Execution Model

  • 2.5 GPU Profiler Commands

Resources/Readings

  • Programming Massively Parallel Processors: A Hands-on Approach
    • Chapter 3

The slides presented in lecture and in these videos are available on our Canvas page. Click the Files link to download the m2.zip file. The slides will be posted right before class.

Synchronous Session (In-Person Lecture)

As a reminder, here are the date and time of the synchronous session for this module:

  • Dates/Times
    • Tuesday April 1st @ 5:30pm-7:20pm

  • Session Outline
    • Querying and Managing Devices

    • Timing Your Kernel

    • GPU Architecture and CUDA Execution Model

Assignment

Assignments are always due on Friday evenings.