From Apache Wiki
[Hadoop Wiki] Update of "CUDA On Hadoop" by ChenHe
Date Wed, 16 Mar 2011
  === For C/C++ programmers ===
  Our experiments use programs from the CUDA SDK. For each program, we first studied the
code and partitioned it into three portions: data generation, bootstrapping, and CUDA
kernels. The first two were turned, respectively, into a standalone data generator and a
virtual method callable from the map method of our MapRed utility class. The CUDA kernel is
kept as-is, since we want to perform the same computation on the GPU, only in a distributed
fashion. The data generator is extended to take command-line arguments so that we can specify
the input size and output location for each experiment run. We reuse the code that bootstraps
a kernel execution as part of the mapper workload, thus integrating CUDA and Hadoop
seamlessly. The architecture of the CUDA SDK programs ported onto Hadoop is shown in Figure 1.
For reusability, we use an object-oriented design that abstracts the mapper and reducer
functions into a base class, MapRed. For a different computation, we override the following
virtual methods defined by MapRed:
+ [[http://cse.unl.edu/~che/images/streaming-2.bmp|Figure 1]]
   void processHadoopData(string& input);
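
The override pattern described above might look roughly like the following. This is a minimal sketch, not the actual MapRed class: only the name MapRed and the signature processHadoopData(string&) come from the text. The map(istream&, ostream&) loop, the one-record-per-line input format, the in-place rewriting of the input string, and the ScaleJob subclass are all assumptions made for illustration. A plain CPU loop stands in where the unchanged CUDA kernel would be bootstrapped, so the sketch compiles without a GPU.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the MapRed base class. Only the class name and
// processHadoopData(string&) appear in the text; everything else is assumed.
class MapRed {
public:
    virtual ~MapRed() = default;

    // Assumed mapper loop: read one record per line, hand it to the
    // job-specific hook, and emit whatever the hook leaves in the string.
    void map(std::istream& in, std::ostream& out) {
        std::string line;
        while (std::getline(in, line)) {
            if (line.empty()) continue;   // skip blank records
            processHadoopData(line);      // job-specific work
            out << line << '\n';
        }
    }

    // Virtual method overridden for each computation.
    virtual void processHadoopData(std::string& input) = 0;
};

// Hypothetical job: scale every number in a record by a constant factor.
class ScaleJob : public MapRed {
public:
    explicit ScaleJob(float factor) : factor_(factor) {}

    void processHadoopData(std::string& input) override {
        // Parse the whitespace-separated floats of one record.
        std::istringstream parse(input);
        std::vector<float> values;
        float v;
        while (parse >> v) values.push_back(v);

        // In a real port, the bootstrapping code taken from the SDK sample
        // would copy `values` to the device and launch the unchanged CUDA
        // kernel here. A CPU loop stands in for that step.
        for (float& x : values) x *= factor_;

        // Write the result back into the record (assumed convention).
        std::ostringstream result;
        for (std::size_t i = 0; i < values.size(); ++i) {
            if (i) result << ' ';
            result << values[i];
        }
        input = result.str();
    }

private:
    float factor_;
};
```

A mapper executable built on this sketch would construct a ScaleJob and call map(std::cin, std::cout). Keeping the record loop in the base class means a new computation only has to supply its own processHadoopData.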

