Vector Add on GPU


Vector Add

In parallel computing, adding two vectors is the standard "Hello World" exercise.

Given two arrays of scalar data, as in the image above, we want to compute their sum element by element: each output element is the sum of the two input elements at the same index.

We start by implementing the algorithm in plain C#.

Edit the file 01-vector-add.cs and implement this algorithm in plain C# until the program displays OK.

If you get stuck, you can refer to the solution.
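As a point of comparison, here is a minimal sketch of an element-wise sum in plain C#. It is not necessarily the tutorial's own solution file: the class name VectorAddSketch, the in-place convention (the result is written into a), the array size and the test values are all assumptions made for illustration.

```csharp
using System;

class VectorAddSketch
{
    // Element-wise sum computed in place: a[i] becomes a[i] + b[i].
    // The in-place convention is an assumption of this sketch.
    public static void Add(float[] a, float[] b, int n)
    {
        for (int i = 0; i < n; ++i)
        {
            a[i] += b[i];
        }
    }

    static void Main()
    {
        const int n = 1024; // illustrative size
        float[] a = new float[n];
        float[] b = new float[n];
        for (int i = 0; i < n; ++i)
        {
            a[i] = i;
            b[i] = 1.0f;
        }

        Add(a, b, n);

        // Check every element against the expected value i + 1.
        for (int i = 0; i < n; ++i)
        {
            if (a[i] != i + 1.0f)
            {
                Console.WriteLine("ERROR at index " + i);
                return;
            }
        }
        Console.WriteLine("OK");
    }
}
```

The loop runs on a single thread and visits the indices in order. No iteration depends on another, which is what makes the computation easy to parallelize later.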


In [ ]:
!hybridizer-cuda ./01-vector-add/01-vector-add.cs -o ./01-vector-add/vectoradd.exe -run

Introduce Parallelism

As we can see in the solution, a plain scalar iterative approach uses only one thread, while modern CPUs typically have several cores (for example, 4 cores and 8 hardware threads).

Fortunately, .NET and C# provide an intuitive construct to leverage parallelism: Parallel.For.

Modify 01-vector-add.cs to distribute the work among multiple threads.

If you get stuck, you can refer to the solution.
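As a rough illustration of the idea, the sketch below distributes an element-wise sum across threads with Parallel.For from System.Threading.Tasks. The class name ParallelVectorAddSketch, the in-place convention and the test data are assumptions for illustration; the tutorial's own solution may be organized differently.

```csharp
using System;
using System.Threading.Tasks;

class ParallelVectorAddSketch
{
    // Element-wise sum computed in place, with iterations spread over
    // the thread pool. Each index is written by exactly one iteration,
    // so no synchronization is needed.
    public static void Add(float[] a, float[] b, int n)
    {
        Parallel.For(0, n, i => { a[i] += b[i]; });
    }

    static void Main()
    {
        const int n = 1024; // illustrative size
        float[] a = new float[n];
        float[] b = new float[n];
        for (int i = 0; i < n; ++i)
        {
            a[i] = i;
            b[i] = 1.0f;
        }

        Add(a, b, n);

        // Check every element against the expected value i + 1.
        for (int i = 0; i < n; ++i)
        {
            if (a[i] != i + 1.0f)
            {
                Console.WriteLine("ERROR at index " + i);
                return;
            }
        }
        Console.WriteLine("OK");
    }
}
```

Parallel.For takes an inclusive lower bound, an exclusive upper bound and a delegate called once per index. The runtime decides how the index range is split among threads, so the loop body must not rely on iteration order.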


In [ ]:
!hybridizer-cuda ./01-vector-add/01-vector-add.cs -o ./01-vector-add/parallel-vectoradd.exe -run

Run Code on the GPU

Using Hybridizer to run the above code on a GPU is quite straightforward. We need to:

  • Decorate the methods we want to run on the GPU.
    This is done by adding the [EntryPoint] attribute to the methods of interest.
  • "Wrap" the current object into a dynamic object able to dispatch code on the GPU.
    This is done by the following boilerplate code:
    dynamic wrapped = HybRunner.Cuda().Wrap(new Program());
    wrapped.mymethod(...);
    
    The wrapped object has the same method signatures (static or instance) as the current object, but dispatches calls to the GPU.

Modify 02-gpu-vector-add.cs so the Add method runs on a GPU.

If you get stuck, you can refer to the solution.
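The two steps above (decorate, then wrap) can be sketched as follows. This is a hedged outline, not the tutorial's solution file. It assumes the Hybridizer runtime is referenced and exposes EntryPoint and HybRunner in the Hybridizer.Runtime.CUDAImports namespace, that Add takes two float arrays and a length, and that the code is built with hybridizer-cuda on a machine with a CUDA-capable GPU. The array size and test values are illustrative.

```csharp
using System;
using System.Threading.Tasks;
using Hybridizer.Runtime.CUDAImports; // assumed namespace of the Hybridizer runtime

class Program
{
    // Step 1: mark the method as a GPU entry point.
    // The signature (float[], float[], int) is an assumption of this sketch.
    [EntryPoint]
    public static void Add(float[] a, float[] b, int n)
    {
        Parallel.For(0, n, i => { a[i] += b[i]; });
    }

    static void Main()
    {
        const int n = 1024; // illustrative size
        float[] a = new float[n];
        float[] b = new float[n];
        for (int i = 0; i < n; ++i)
        {
            a[i] = i;
            b[i] = 1.0f;
        }

        // Step 2: wrap the current object so that calls are dispatched to the GPU.
        dynamic wrapped = HybRunner.Cuda().Wrap(new Program());
        wrapped.Add(a, b, n);

        // Check every element against the expected value i + 1.
        for (int i = 0; i < n; ++i)
        {
            if (a[i] != i + 1.0f)
            {
                Console.WriteLine("ERROR at index " + i);
                return;
            }
        }
        Console.WriteLine("OK");
    }
}
```

Calling Program.Add directly, without going through wrapped, runs the same method on the CPU, which gives a convenient way to compare the two results.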


In [ ]:
!hybridizer-cuda ./02-gpu-vector-add/02-gpu-vector-add.cs -o ./02-gpu-vector-add/gpu-vectoradd.exe -run