Stream Compaction - Introduction

1) Markus Billeter et al., Efficient Stream Compaction on Wide SIMD Many-Core Architectures
2) D.M. Hughes, InK-Compact: In kernel Stream Compaction and Its Application to Multi-kernel Data Visualization on GPGPU

An efficient implementation of stream compaction on the GPU is presented, with code examples. The full source code is on GitHub.

Stream compaction (or stream reduction) commonly refers to the operation of removing unwanted elements from a collection. More formally, imagine we have a list of N elements and a predicate that splits it into wanted and unwanted elements (those that satisfy the predicate and those that do not). The stream compaction of under is an …
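To make the definition concrete, here is a minimal sequential sketch in C++. It is not the GPU implementation from the post, only a CPU reference for the result a parallel version has to reproduce. The name `compact` is hypothetical, and keeping the surviving elements in their original order is an assumption (it is the usual convention, but the excerpt above does not state it).

```cpp
#include <vector>

// Sequential reference for stream compaction (hypothetical helper, not the
// post's GPU code): keep the elements for which `pred` returns true and
// drop the rest. Assumption: survivors keep their original relative order.
template <typename T, typename Pred>
std::vector<T> compact(const std::vector<T>& in, Pred pred) {
    std::vector<T> out;
    out.reserve(in.size());  // at most in.size() elements survive
    for (const T& x : in) {
        if (pred(x)) {
            out.push_back(x);
        }
    }
    return out;
}
```

For instance, compacting the list 3, -1, 0, 7, -5, 2 with the predicate "x > 0" yields 3, 7, 2.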

This article presents a parallel CUDA code for generating the famous Julia set. Informally, a point z_0 of the complex plane belongs to the set if, given a function f(z), the sequence z_{n+1} = f(z_n) does not tend to infinity. Different functions and different initial conditions eventually give rise to fractals. One of the most famous sequences is z_{n+1} = z_n^2 + c (the one that generated the video below).
Some nice pictures can be obtained with the following initial conditions:
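The membership test can be sketched as an escape-time loop. The version below assumes the quadratic map z -> z*z + c and uses the standard escape radius of 2, which is valid when |c| <= 2 (once |z| exceeds 2 the sequence is guaranteed to diverge). The function name `escapeTime`, its parameters, and the `HOST_DEVICE` macro are illustrative assumptions, not the names used in the repository. The macro lets the same function compile with a plain C++ compiler or with nvcc.

```cpp
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Iterate z <- z*z + c starting from (zr, zi). Returns the number of steps
// taken before |z| exceeds 2, or maxIter if that never happens (the point is
// then treated as belonging to the set). Hypothetical helper; assumes the
// quadratic map and |c| <= 2.
HOST_DEVICE inline int escapeTime(double zr, double zi,
                                  double cr, double ci, int maxIter) {
    for (int n = 0; n < maxIter; ++n) {
        if (zr * zr + zi * zi > 4.0) {
            return n;  // |z|^2 > 4, i.e. |z| > 2: escaped
        }
        double nextRe = zr * zr - zi * zi + cr;  // real part of z*z + c
        zi = 2.0 * zr * zi + ci;                 // imaginary part (uses old zr)
        zr = nextRe;
    }
    return maxIter;
}
```

Small return values mean the point diverges quickly; the count is what is typically mapped to a pixel color.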

# dendrite fractal

# douady's rabbit fractal

# san marco fractal

# siegel disk fractal

# NEAT cauliflower thingy

# galaxies

# groovy

# frost

Here is a video showing a sequence of pictures generated using different initial conditions. A nice example of how math and computer science can produce art!

CUDA Julia set code

The code is written so that it is very easy to change the function and the initial condition: just edit the device functor.
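One way such a functor-based design might look is sketched below. All names here (`Complex`, `QuadraticMap`, `CubicMap`, `evolve`, `HOST_DEVICE`) are hypothetical and are not taken from the repository. The point is only that the iteration loop is written once and the function f(z) is swapped by passing a different functor.

```cpp
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Minimal complex type usable on host and device (hypothetical).
struct Complex { double re, im; };

HOST_DEVICE inline Complex mul(Complex a, Complex b) {
    return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

// f(z) = z^2 + c
struct QuadraticMap {
    Complex c;
    HOST_DEVICE Complex operator()(Complex z) const {
        Complex z2 = mul(z, z);
        return {z2.re + c.re, z2.im + c.im};
    }
};

// f(z) = z^3 + c
struct CubicMap {
    Complex c;
    HOST_DEVICE Complex operator()(Complex z) const {
        Complex z3 = mul(mul(z, z), z);
        return {z3.re + c.re, z3.im + c.im};
    }
};

// Generic escape-time loop: returns how many applications of f it takes for
// |z| to exceed `bailout`, or maxIter if it never does.
template <typename Map>
HOST_DEVICE int evolve(Complex z, Map f, int maxIter, double bailout) {
    for (int n = 0; n < maxIter; ++n) {
        if (z.re * z.re + z.im * z.im > bailout * bailout) {
            return n;
        }
        z = f(z);
    }
    return maxIter;
}
```

Switching from the quadratic to the cubic map is then a one-line change at the call site, for example `evolve(z, CubicMap{{0.0, 0.0}}, 256, 2.0)` instead of `evolve(z, QuadraticMap{{0.0, 0.0}}, 256, 2.0)`.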

The code above contains the GPU instructions devoted to the computation of a single set. The kernel computeJulia() is executed by DIMX*DIMY threads (the dimensions of the image). Each thread handles a single pixel with integer coordinates: it converts the pixel indices to the corresponding complex point in the viewport and independently calls evolveComplexPoint() on it. evolveComplexPoint() takes care of checking how fast the point diverges.
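The pixel-to-plane conversion and the one-thread-per-pixel layout described above can be sketched as follows. This is an illustration under assumptions, not the repository's computeJulia(): the names `Viewport`, `Point`, `pixelToComplex`, `renderKernel`, and `renderHost` are hypothetical, and the linear mapping that sends the corner pixels to the viewport bounds is one common choice among several. The kernel is compiled only under nvcc; `renderHost` is a CPU stand-in that visits the same pixels in a loop.

```cpp
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Rectangle of the complex plane shown in the image (hypothetical type).
struct Viewport { double xMin, xMax, yMin, yMax; };
struct Point { double re, im; };

// Map integer pixel (px, py) of a width x height image to a complex point.
// Pixel (0, 0) lands on (xMin, yMin), pixel (width-1, height-1) on
// (xMax, yMax). Requires width > 1 and height > 1.
HOST_DEVICE inline Point pixelToComplex(int px, int py,
                                        int width, int height, Viewport v) {
    double re = v.xMin + (v.xMax - v.xMin) * px / (width - 1);
    double im = v.yMin + (v.yMax - v.yMin) * py / (height - 1);
    return {re, im};
}

#ifdef __CUDACC__
// One thread per pixel. `perPoint` is any device-callable functor that maps
// a Point to an int (for example an escape-time count).
template <typename PerPoint>
__global__ void renderKernel(int* out, int width, int height,
                             Viewport v, PerPoint perPoint) {
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    if (px < width && py < height) {  // guard threads outside the image
        out[py * width + px] =
            perPoint(pixelToComplex(px, py, width, height, v));
    }
}
#endif

// CPU stand-in for the kernel: same per-pixel work, done sequentially.
template <typename PerPoint>
void renderHost(int* out, int width, int height,
                Viewport v, PerPoint perPoint) {
    for (int py = 0; py < height; ++py) {
        for (int px = 0; px < width; ++px) {
            out[py * width + px] =
                perPoint(pixelToComplex(px, py, width, height, v));
        }
    }
}
```

Because every pixel is computed independently of all the others, no synchronization between threads is needed.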

The code is available here on GitHub. It generates a number of PNG images that can then be merged into a video (like the one presented here) using a command-line tool such as ffmpeg.

To compile and run it you need the CUDA SDK, a CUDA-capable device, and libpng installed on your system.
To compile and/or generate the video, simply run: