Instructor: John Reppy (Ryerson 256)
Lecture: Mondays, 3-5pm (Ry 255)
SS-106
The focus of this seminar is high-level languages and models for programming GPUs. We will begin by looking at the architectural features that make GPUs both very fast and very difficult to program. With that background in place, we will read and discuss recent (and some not-so-recent) papers on languages and models for GPUs.
Note: The seminar was originally scheduled to meet twice a week, but we will instead meet once a week for two hours. The new meeting time and location are Mondays, 3pm to 5pm, in Ry 255.
For week 2, please read the paper Parallel Prefix Sum (Scan) with CUDA, which appeared as Chapter 39 of GPU Gems 3.
For week 3, we will look at an approach for handling tree traversals in GPU programs that has been developed by researchers at Purdue. There are two papers:
More discussion of the techniques from last week, plus one additional paper:
This week we will look at ray tracing on GPUs and the use of persistent threads as an implementation technique. There are several papers:
Understanding the Efficiency of Ray Traversal on GPUs (Proceedings of High Performance Graphics 2009). This paper describes the difficulties with implementing ray tracing on a GPU and possible solutions, including the use of persistent threads.
GPU Ray Tracing (CACM 2013). This paper describes the OptiX system from NVIDIA; the original version was presented at SIGGRAPH 2010.
A Study of Persistent Threads Style GPU Programming for GPGPU Workloads (Innovative Parallel Computing 2012).
This week we will look at a couple of low-level languages that have been designed for GPU programming.
HiDP: A Hierarchical Data Parallel Language (CGO 2013).
NOVA: A Functional Language for Data Parallelism (ARRAY '14).
Size Slicing: A Hybrid Approach to Size Inference in Futhark (FHPC '14).
This week we will look at several papers on flattening nested data parallelism.
This week we will look at more papers on flattening nested data parallelism:
Flattening Trees (EuroPar 1998).
On the Distributed Implementation of Aggregate Data Structures by Program Transformation (11th IPPS/SPDP 1999).
Nepal – Nested Data Parallelism in Haskell (EuroPar 2001).
This week we will finish up our discussion of flattening and then look at some papers on piecewise execution of NDP programs.
This week we will look at some optimization techniques for NDP.
Vectorisation Avoidance (Haskell Workshop 2012).
Fusing Filters with Integer Linear Programming (FHPC 2014).