This shows you the differences between two versions of the page.

sesiuni:opencl [2015/09/30 10:26]
ahamuraru [Trainer]
sesiuni:opencl [2015/10/03 18:37] (current)
Line 100: Line 100:
  \\  \\
 [[ https://​www.linkedin.com/​profile/​view?​id=AAMAABrGlewBEo8RxIGsnUsiVYRtdikgYYbw_6Q&​authType=name&​authToken=lSum&​trk=hp-feed-member-name|{{http://​allthingsgear.com/​wp-content/​uploads/​2013/​07/​linkedinicon.png?​28}}]] [[ https://​www.linkedin.com/​profile/​view?​id=AAMAABrGlewBEo8RxIGsnUsiVYRtdikgYYbw_6Q&​authType=name&​authToken=lSum&​trk=hp-feed-member-name|{{http://​allthingsgear.com/​wp-content/​uploads/​2013/​07/​linkedinicon.png?​28}}]]
 + \\
 + \\
 + \\
 +== After the Workshop ==
 +For some of the participants, the lab sessions were simply not enough, so after the workshop we had no choice but to hold a small competition for them.
 +The participants were given a functional implementation of an algorithm in C and OpenCL. There were two goals: to get the best possible performance out of the OpenCL kernel and to get the best overall speedup for the entire application. All participants had to use the same machine and the same GPU.
 +And the winners are (...drumroll...):​ **Cristi Alexandru Vasile** and **Costin Giorgian Papuc**! Congratulations!
 +The runner-up, with very close performance, is Alexandru Grad.
 +Here are the results of our winners:
 +^  Name  ^  Input Size  ^  Overall Speedup  ^  Kernel Speedup  ^
 +|Cristi Alexandru Vasile| 16K | 28.22X |  2.31X  |
 +|Cristi Alexandru Vasile| 64K | 26.16X |  2.29X  |
 +|Cristi Alexandru Vasile| 144K | 25.97X |  2.31X  |
 +|Cristi Alexandru Vasile| 256K | 25.81X |  **2.51X** ​ |
 +|Costin Giorgian Papuc| 16K | **29.18X** |  2.32X  |
 +|Costin Giorgian Papuc| 64K | 26.86X |  2.29X  |
 +|Costin Giorgian Papuc| 144K | 26.98X |  2.29X  |
 +|Costin Giorgian Papuc| 256K | 26.27X |  2.36X  |
 +The overall speedup is measured as the ratio between the execution time of the C implementation and the execution time for the OpenCL implementation.
 +The measured execution time for the OpenCL implementation also includes the time for allocating buffers on the device, transferring the data to the device and back to the host. However, it does not include the time needed for initializing the OpenCL context and building the OpenCL kernel.
 +The C implementation is single-threaded and does not make use of SIMD instructions.
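The measurement rule described above can be sketched in C roughly as follows. This is only an illustration, not the competition's actual timing harness: the `opencl_timing` struct, its field names, and both helper functions are hypothetical, and the sketch assumes each phase was timed separately, in seconds.

```c
#include <assert.h>

/* Hypothetical per-phase breakdown of one OpenCL run, in seconds.
   The field names are illustrative and not taken from the competition code. */
typedef struct {
    double context_init;   /* excluded from the measured time */
    double kernel_build;   /* excluded from the measured time */
    double buffer_alloc;   /* included: allocating buffers on the device */
    double host_to_device; /* included: transferring input data to the device */
    double kernel_exec;    /* included: running the kernel */
    double device_to_host; /* included: transferring results back to the host */
} opencl_timing;

/* Sum only the phases that count toward the measured OpenCL time. */
double measured_opencl_time(const opencl_timing *t)
{
    return t->buffer_alloc + t->host_to_device
         + t->kernel_exec + t->device_to_host;
}

/* Overall speedup: C execution time divided by measured OpenCL time. */
double overall_speedup(double c_seconds, const opencl_timing *t)
{
    double opencl_seconds = measured_opencl_time(t);
    assert(opencl_seconds > 0.0); /* guard against division by zero */
    return c_seconds / opencl_seconds;
}
```

With made-up numbers, a C run of 56.0 s against OpenCL phases of 0.25 s, 0.5 s, 1.0 s and 0.25 s gives a measured OpenCL time of 2.0 s and an overall speedup of 28X, regardless of how long context initialization and the kernel build took.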
sesiuni/opencl.txt · Last modified: 2015/10/03 18:37 by ahamuraru