[Hadoop Wiki] Update of "CUDA On Hadoop" by ChenHe
Date: Mon, 14 Mar 2011 16:28:37 -0000
http://wiki.apache.org/hadoop/CUDA%20On%20Hadoop?action=diff&rev1=12&rev2=13
--------------------------------------------------
The other approach is a little bit tricky: you can manually compile the CUDA code into binary files in advance and move them to the TaskTrackers' working directory. Every TaskTracker can then access those compiled binary files.

=== For C/C++ programmers ===
We employ CUDA SDK programs in our experiments. For each CUDA SDK program, we first digested the code and partitioned it into portions for data generation, bootstrapping, and CUDA kernels, with the former two components transformed respectively into a standalone data generator and a virtual method callable from the map method in our MapRed utility class.
The CUDA kernel is kept as-is, since we want to perform the same computation on the GPU, only in a distributed fashion. The data generator is augmented to take command-line arguments so that we can specify input sizes and output locations for different experiment runs. We reuse the code for bootstrapping a kernel execution as part of the mapper workload, thus providing a seamless integration of CUDA and Hadoop. The architecture of the CUDA SDK programs ported onto Hadoop is shown in Figure 1. For reusability, we have used an object-oriented design, abstracting the mapper and reducer functions into a base class, MapRed. For different computations, we can override the following virtual methods defined by MapRed:

{{{
void processHadoopData(string& input);
void cudaCompute(std::map& output);
}}}

The processHadoopData method provides a hook for the CUDA program to initialize its internal data structures by parsing the input passed from HDFS. Thereafter, MapRed invokes the cudaCompute method, in which the CUDA kernel is launched. The results of the computation are stored in the map object and sent back to HDFS for reduction.
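To make the parse-then-compute flow concrete, here is a minimal sketch of how a MapRed subclass could plug into that sequence. Only the two method names processHadoopData and cudaCompute come from the page above; the class layout, member names, the map() driver, and the SumMapRed example are hypothetical, and the GPU kernel is simulated on the CPU since a real version would copy the parsed data to device memory and launch a CUDA kernel inside cudaCompute:

{{{
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch of the MapRed base class described above.
// The real class would also carry the Hadoop plumbing; only the
// two virtual hooks are named in the wiki text.
class MapRed {
public:
    virtual ~MapRed() {}
    // Parse the input record(s) handed over from HDFS.
    virtual void processHadoopData(std::string& input) = 0;
    // Launch the computation and collect results into `output`.
    virtual void cudaCompute(std::map<std::string, std::string>& output) = 0;

    // Driver invoked from the Hadoop map method: parse, compute, emit.
    void map(std::string input, std::map<std::string, std::string>& output) {
        processHadoopData(input);
        cudaCompute(output);
    }
};

// Illustrative subclass: sums a whitespace-separated list of numbers.
class SumMapRed : public MapRed {
    std::vector<double> values;
public:
    void processHadoopData(std::string& input) override {
        std::istringstream in(input);
        double v;
        while (in >> v) values.push_back(v);
    }
    void cudaCompute(std::map<std::string, std::string>& output) override {
        double sum = 0;
        for (double v : values) sum += v;  // stand-in for the GPU kernel
        std::ostringstream out;
        out << sum;
        output["sum"] = out.str();
    }
};
}}}

A concrete program would instantiate SumMapRed inside the mapper, call map() with the record received from Hadoop, and emit each key/value pair in the output map.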