Processing large amounts of data on a GPU can be a problem: the data may not fit in GPU memory.
In this lab, we study how to allocate data on the host and use it from the GPU, with the Sobel filter as the example.
We start with a first GPU implementation of the filter:
[EntryPoint]
public static void ComputeSobel(byte[] outputPixel, byte[] inputPixel, int width, int height)
{
    // Grid-stride loops: each thread handles rows i and columns j spaced by the grid size.
    for (int i = threadIdx.y + blockIdx.y * blockDim.y; i < height; i += blockDim.y * gridDim.y)
    {
        for (int j = threadIdx.x + blockIdx.x * blockDim.x; j < width; j += blockDim.x * gridDim.x)
        {
            int output = 0;
            // Skip border pixels: they do not have all eight neighbours.
            if (i != 0 && j != 0 && i != height - 1 && j != width - 1)
            {
                int pixelId = i * width + j;
                // Read the eight neighbours of the current pixel.
                byte topl = inputPixel[pixelId - width - 1];
                byte top = inputPixel[pixelId - width];
                byte topr = inputPixel[pixelId - width + 1];
                byte l = inputPixel[pixelId - 1];
                byte r = inputPixel[pixelId + 1];
                byte botl = inputPixel[pixelId + width - 1];
                byte bot = inputPixel[pixelId + width];
                byte botr = inputPixel[pixelId + width + 1];
                // Horizontal and vertical gradients.
                int sobelx = (topl) + (2 * l) + (botl) - (topr) - (2 * r) - (botr);
                int sobely = (topl + 2 * top + topr - botl - 2 * bot - botr);
                int squareSobelx = sobelx * sobelx;
                int squareSobely = sobely * sobely;
                // Gradient magnitude, clamped to the byte range below.
                output = (int)Math.Sqrt((squareSobelx + squareSobely));
                if (output < 0)
                {
                    output = -output;
                }
                if (output > 255)
                {
                    output = 255;
                }
                outputPixel[pixelId] = (byte)output;
            }
        }
    }
}
This approach is parallel. As you can see in the source file, the image is loaded and processed with a basic use of Hybridizer: everything is copied to the GPU before processing.
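The two loops in ComputeSobel are grid-stride loops: each thread starts at its global index and advances by the total number of threads in that dimension. The sketch below simulates one dimension of that pattern on the CPU in plain Python to show that every index is visited exactly once. The helper names and the launch configuration (2 blocks of 4 threads over 10 rows) are hypothetical and chosen only for illustration; they are not taken from the lab's source files.

```python
def grid_stride_indices(thread_idx, block_idx, block_dim, grid_dim, n):
    """Indices visited by one thread in a 1D grid-stride loop over [0, n)."""
    start = thread_idx + block_idx * block_dim   # global thread index
    stride = block_dim * grid_dim                # total threads in this dimension
    return list(range(start, n, stride))


def coverage(block_dim, grid_dim, n):
    """Count how many times each index in [0, n) is visited by the whole grid."""
    counts = [0] * n
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            for i in grid_stride_indices(thread_idx, block_idx, block_dim, grid_dim, n):
                counts[i] += 1
    return counts


# Hypothetical configuration: 2 blocks of 4 threads covering 10 rows.
print(grid_stride_indices(1, 0, 4, 2, 10))       # [1, 9]
print(all(c == 1 for c in coverage(4, 2, 10)))   # True
```

Because the stride equals the total thread count, the same kernel works for any image size, whether the grid is smaller or larger than the image.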
In [ ]:
!hybcc 01-parallel-gpu.cs -additionalDotNetAssemblies System.Drawing -o 01/01-parallel-gpu.exe -run
# convert bmp to png to have interactive display
from PIL import Image
img = Image.open('./01/hybrid/lena_highres_sobel.bmp')
img.save('./01/hybrid/lena_highres_sobel.png', 'png')
from IPython.display import Image
Image(filename="./01/hybrid/lena_highres_sobel.png", width=384, height=384)
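To sanity-check the per-pixel arithmetic of ComputeSobel, it can help to have a small CPU reference. The following is a minimal sketch in plain Python that mirrors the kernel's formula on a row-major grayscale image stored as a flat list. The function name `sobel_reference` is hypothetical, and the sketch assumes the output buffer starts at zero, so border pixels (which the kernel never writes) stay at 0.

```python
import math


def sobel_reference(pixels, width, height):
    """CPU reference mirroring ComputeSobel on a flat, row-major grayscale image."""
    out = [0] * (width * height)  # assumption: output starts zeroed, borders stay 0
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            p = i * width + j
            topl, top, topr = pixels[p - width - 1], pixels[p - width], pixels[p - width + 1]
            l, r = pixels[p - 1], pixels[p + 1]
            botl, bot, botr = pixels[p + width - 1], pixels[p + width], pixels[p + width + 1]
            sobelx = topl + 2 * l + botl - topr - 2 * r - botr
            sobely = topl + 2 * top + topr - botl - 2 * bot - botr
            # Truncate like the C# (int) cast, then clamp to the byte range.
            magnitude = int(math.sqrt(sobelx * sobelx + sobely * sobely))
            out[p] = min(magnitude, 255)
    return out


# 3x3 image whose bottom row is brighter: only the centre pixel is computed.
image = [0, 0, 0,
         0, 0, 0,
         10, 10, 10]
print(sobel_reference(image, 3, 3)[4])  # 40
```

A uniform image gives 0 everywhere, and a strong edge saturates at 255, which matches the clamping in the kernel.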
Now we want to leave the data on the host instead of copying it to the GPU, which requires a few changes:
Modify 02-lock-gpu.cs so that the image is accessed with zero copy and processed on the GPU.
If you get stuck, you can refer to the solution.
In [ ]:
!hybcc 02-lock-gpu.cs -additionalDotNetAssemblies System.Drawing -o 02/02-lock-gpu.exe -run
# convert bmp to png to have interactive display
from PIL import Image
img = Image.open('./02/hybrid/lena_highres_sobel.bmp')
img.save('./02/hybrid/lena_highres_sobel.png', 'png')
from IPython.display import Image
Image(filename="./02/hybrid/lena_highres_sobel.png", width=384, height=384)