Heterogeneous Parallel Programming

January 25, 2013

I recently finished the free Coursera online course Heterogeneous Parallel Programming, presented by the University of Illinois. Most of the course focused on the NVIDIA CUDA programming environment, which gives API access to a graphics card’s GPU (Graphics Processing Unit). Typical CUDA applications running on the GPU hardware can fully utilize multiple cores with many threads per core, with threads constantly swapping between running and waiting as I/O operations complete.

The class tested students’ knowledge with 5 quizzes and 5 programming assignments. The programming assignments ran in a remote environment, and students interacted with it entirely through a web browser: we edited our code and ran it against test data sets all in the browser. The code was written exclusively in C with the CUDA extensions that allow execution of CUDA “kernels”. Most of the programming assignments focused on the basics of CUDA: how to allocate, transfer, and process data on the GPU. Half the assignments involved some sort of matrix operation. The other half implemented parallel algorithms such as reduction, prefix scan, and convolution.
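
To give a feel for what a parallel reduction looks like, here is a minimal sketch in plain C rather than the course's actual CUDA code. It emulates on the CPU the pairwise (tree) pattern a GPU block would follow: each pass of the inner loop stands in for one GPU thread, and the end of each outer step is where a real kernel would place a barrier such as `__syncthreads()`. The function name `tree_reduce_sum` is made up for illustration, and note that it overwrites the input array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the course): sums n floats in place using
 * the pairwise "tree" pattern of a parallel reduction. */
float tree_reduce_sum(float *data, size_t n)
{
    assert(data != NULL || n == 0);
    if (n == 0) {
        return 0.0f;
    }

    /* The stride doubles each step: 1, 2, 4, ...
     * so the number of steps grows with log2(n), not n. */
    for (size_t stride = 1; stride < n; stride *= 2) {
        /* On a GPU, every addition in this inner loop would be done by a
         * different thread at the same time, followed by a barrier. */
        for (size_t i = 0; i + stride < n; i += 2 * stride) {
            data[i] += data[i + stride];
        }
    }

    /* After the last step the total has accumulated in element 0. */
    return data[0];
}
```

Within one step the additions touch disjoint pairs of elements, which is why they could all run concurrently on real hardware.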

The most valuable point I took away from this class was how much processing power goes unutilized in most computing hardware. As developers, we think of the latency of I/O operations (network, disk) as something to minimize so we can maximize throughput, but this class shows that even that leaves a lot of power unused. Some of the tips covered how to judiciously load parts of the data into “__shared__” memory on the GPU chip to avoid costly accesses to main memory (DRAM), each of which is on the order of 500 clock cycles.
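
The tiling idea behind that tip can be sketched in plain C. This is a CPU analogue under stated assumptions, not the course's CUDA code: the small local arrays `a_tile` and `b_tile` play the role that `__shared__` buffers would play in a kernel, the function name `tiled_matmul` and the tile width `TILE` are chosen only for illustration, and the matrices are assumed to be square, row-major, and a multiple of `TILE` in size.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 2 /* assumed tile width, kept tiny for illustration */

/* Hypothetical helper: C = A * B for square n x n row-major matrices. */
void tiled_matmul(const float *A, const float *B, float *C, size_t n)
{
    assert(n % TILE == 0);

    for (size_t i = 0; i < n * n; i++) {
        C[i] = 0.0f;
    }

    for (size_t bi = 0; bi < n; bi += TILE) {         /* tile row of C */
        for (size_t bj = 0; bj < n; bj += TILE) {     /* tile column of C */
            for (size_t bk = 0; bk < n; bk += TILE) { /* walk the shared dimension */
                /* Stand-ins for __shared__ buffers in a real kernel. */
                float a_tile[TILE][TILE];
                float b_tile[TILE][TILE];

                /* Load phase: one read from "main memory" per tile element. */
                for (size_t r = 0; r < TILE; r++) {
                    for (size_t c = 0; c < TILE; c++) {
                        a_tile[r][c] = A[(bi + r) * n + (bk + c)];
                        b_tile[r][c] = B[(bk + r) * n + (bj + c)];
                    }
                }

                /* Compute phase: each loaded value is reused TILE times
                 * from the fast local buffers. */
                for (size_t r = 0; r < TILE; r++) {
                    for (size_t c = 0; c < TILE; c++) {
                        for (size_t k = 0; k < TILE; k++) {
                            C[(bi + r) * n + (bj + c)] += a_tile[r][k] * b_tile[k][c];
                        }
                    }
                }
            }
        }
    }
}
```

The point of the split into a load phase and a compute phase is that each value fetched from slow memory is used several times before it is discarded, which is the saving the shared-memory tip is after.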

The low manufacturing cost and lower power requirements of GPU processors continue to deliver more and more cores, and therefore more and more computing power. This class gives a great introduction to how this hardware can be leveraged with the right algorithms and approaches. Even the JVM is working on exploiting this extra GPU power via Project Sumatra.

