
An Introduction to GPU Driving with OpenCL

Press the gas pedal of a Venom GT car to the max and you can reach a speed of over 400 km/h. Ask Andy Roddick to show you a fast serve and you will hear the sound that a tennis ball makes when flying at a speed of almost 250 km/h.

Now imagine not just two, not tens, but thousands of accelerator pedals being pressed to the floor at the same time. All in parallel! Imagine thousands of powerful tennis serves and the sound that they make. All in parallel!

No, Clarkson will not be your trainer for this workshop, nor will Andy Roddick explain to you the secret recipe for the perfect forehand.

But, as we still have to quench our thirst for speed and performance, we’ll take a look together at the hundreds or thousands of cores that your computer probably has and we will teach you the fundamentals for starting your own experiments with parallel burning cores.

Throughout the course you will learn the basics of the OpenCL parallel programming paradigm, with a focus on GPUs. While getting familiar with the OpenCL concepts, you will add OpenCL functionality to an existing image processing application written in C and port its algorithms to run on the GPU.

Can you make it run faster? How much faster?

When and Where?

September 12th - September 16th 2015.

Date                | Time        | Room
September 12th 2015 | 10:00-13:00 | EG304
September 13th 2015 | 10:00-13:00 | EG304
September 14th 2015 | 18:00-20:30 | EG304
September 15th 2015 | 18:00-20:30 | EG304
September 16th 2015 | 18:00-20:30 | EG304

Workshop Agenda


  • Theory
    • OpenCL platforms, hosts and devices
    • Compute units, work groups, work items
    • Memory hierarchy
  • Lab session
    • Detect available OpenCL platforms and devices on your system
    • Query capabilities of the detected OpenCL platforms and devices


  • Theory
    • OpenCL execution model
    • Kernels, queues, synchronization
    • Memory objects
    • The OpenCL language
  • Lab session
    • How to map work items on the problem space
    • Transfer data to/from GPU
    • Implement the OpenCL kernel for the first image processing operation (IPO1)
    • Transfer data to GPU and back from the GPU


  • Theory
    • Profiling, events
  • Lab session
    • Profile the kernel for IPO1
    • Implement the kernel for the second image processing operation (IPO2)
    • Profiling and analysis
    • Implement the kernel for the third image processing operation (IPO3)


  • Theory
    • Recap memory hierarchy and memory objects
    • Synchronization across work items
  • Lab session
    • Profile and optimize the kernels


  • Theory
    • Images
  • Lab session
    • Profile and optimize the existing OpenCL implementation

Target Audience and Prerequisites

If you are interested in learning the fundamentals of OpenCL or simply eager to take a first step in the world of parallel programming with GPUs, then you're definitely part of the target audience. You are expected to be familiar with computer architecture and have good C programming knowledge.


To register for this workshop, please fill in the form. Please try to just be yourself and provide honest and simple answers. We want to get a better idea of what you already know and what you would like to learn, and also to polish the last details of the training materials according to your requirements and preferences. For any questions regarding this workshop, please feel free to contact the trainer.

Registration is now closed.

About the Organizers

The workshop is organized by ROSEdu in partnership with StreamComputing.

We, the people at StreamComputing, are crazy about speed and performance. We specialize in optimizing software by means of GPUs, multi-core CPUs, FPGAs or any other kind of hardware that usually lies around unused by normal applications. When people need faster code, that's when we come in.

Course Staff

Trainer: Anca Hamuraru

Assistant Trainer: Albert Zaharovits

sesiuni/opencl.1443597960.txt.gz · Last modified: 2015/09/30 10:26 by ahamuraru