hama-dev mailing list archives

From Martin Illecker <millec...@apache.org>
Subject Re: HybridBSP (CPU and GPU) Task Integration
Date Sun, 25 Aug 2013 14:08:22 GMT
Thanks, your picture [1] illustrates this scenario very well!

In short, I have to modify runBSP in BSPTask so that it checks whether the
submitted task extends HybridBSP.
If so, it starts a PipesBSP server, waits for incoming connections, and runs
the bspGpu method of
the HybridBSP task.
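
The check described above could look roughly like the following sketch. It does not use the Hama API: SketchBSP, SketchHybridBSP, the two demo tasks and HybridDispatchSketch are hypothetical stand-ins, the real bsp/bspGpu methods take a BSPPeer, and starting the PipesBSP server is reduced to a log entry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Hama's BSP class (the real bsp() takes a BSPPeer).
abstract class SketchBSP {
    abstract void bsp(List<String> log);
}

// Hypothetical stand-in for HybridBSP: a BSP that also offers a GPU entry point.
abstract class SketchHybridBSP extends SketchBSP {
    abstract void bspGpu(List<String> log);
}

// Plain CPU-only task.
class DemoCpuTask extends SketchBSP {
    @Override
    void bsp(List<String> log) { log.add("bsp"); }
}

// Hybrid task with both entry points.
class DemoHybridTask extends SketchHybridBSP {
    @Override
    void bsp(List<String> log) { log.add("bsp"); }

    @Override
    void bspGpu(List<String> log) { log.add("bspGpu"); }
}

class HybridDispatchSketch {
    // Assumed shape of the modified runBSP: hybrid tasks get a Pipes server
    // and run bspGpu, every other task runs the default bsp method.
    static List<String> runBSP(SketchBSP task) {
        List<String> log = new ArrayList<>();
        if (task instanceof SketchHybridBSP) {
            log.add("start PipesBSP server"); // placeholder, no real server here
            ((SketchHybridBSP) task).bspGpu(log);
        } else {
            task.bsp(log);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runBSP(new DemoHybridTask()));
        System.out.println(runBSP(new DemoCpuTask()));
    }
}
```

With this check alone, every HybridBSP task would run bspGpu. Choosing between bsp and bspGpu per task is the scheduling question that follows.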

Regarding scheduling:

1) Within runBSP I have to decide whether to execute the bspGpu method or the
default bsp method
of HybridBSP.
E.g., with numTaskBsp set to 8, Hama will start 8 separate Java threads.
If I set an additional conf property numTaskBspGpu to 1, I want to have 9
bsp tasks.
(I don't know where these bsp threads are started. That place would need a
property check for numTaskBspGpu and would have to start the additional bsp tasks.)
8 tasks should execute the default bsp method within runBSP and only one
task should run bspGpu.
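
A minimal sketch of the intended task count. numTaskBsp and numTaskBspGpu are the property names from above; the TaskRoleSketch class and the rule that the GPU tasks are simply the tasks with the highest ids are assumptions made up for illustration, not something Hama does today.

```java
import java.util.ArrayList;
import java.util.List;

class TaskRoleSketch {
    enum Role { CPU, GPU }

    // Returns one role per task id. Assumption: CPU tasks come first and the
    // GPU tasks are appended, so the total is numTaskBsp + numTaskBspGpu.
    static List<Role> assignRoles(int numTaskBsp, int numTaskBspGpu) {
        if (numTaskBsp < 0 || numTaskBspGpu < 0) {
            throw new IllegalArgumentException("task counts must not be negative");
        }
        List<Role> roles = new ArrayList<>();
        for (int i = 0; i < numTaskBsp; i++) {
            roles.add(Role.CPU); // these tasks would call bsp()
        }
        for (int i = 0; i < numTaskBspGpu; i++) {
            roles.add(Role.GPU); // these tasks would call bspGpu()
        }
        return roles;
    }

    public static void main(String[] args) {
        // 8 CPU tasks plus 1 GPU task, as in the example above
        System.out.println(assignRoles(8, 1));
    }
}
```

Each task could then look up its own role by task id inside runBSP and pick bsp or bspGpu accordingly.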

2) It should be possible to control how input data is distributed among bsp
tasks. (This belongs to the partitioning job.)
E.g., with 8 CPU bsp tasks and 1 GPU bsp task, I would like a property
that controls how much of the input goes to
which task. By default, does Hama's partitioning job divide the input data
(e.g., a sequence file) by the number of tasks?
It might happen that, e.g., 80% of the input data should go to the GPU task
and only 20% to the CPU tasks.
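
To make the 80/20 example concrete, here is a small sketch of such a weighted split. It only does the arithmetic on record counts and is not tied to Hama's partitioning job; the WeightedSplitSketch class and the gpuPercent parameter are hypothetical, and a single GPU task is assumed.

```java
class WeightedSplitSketch {
    // Returns the number of records per task. Indices 0..cpuTasks-1 are the
    // CPU tasks, the last index is the single GPU task.
    static long[] split(long totalRecords, int cpuTasks, int gpuPercent) {
        if (totalRecords < 0 || cpuTasks < 1 || gpuPercent < 0 || gpuPercent > 100) {
            throw new IllegalArgumentException("invalid split parameters");
        }
        long[] shares = new long[cpuTasks + 1];
        long gpuShare = totalRecords * gpuPercent / 100; // rounded down
        long rest = totalRecords - gpuShare;
        for (int i = 0; i < cpuTasks; i++) {
            // Spread the remainder over the first CPU tasks so no record is lost.
            shares[i] = rest / cpuTasks + (i < rest % cpuTasks ? 1 : 0);
        }
        shares[cpuTasks] = gpuShare;
        return shares;
    }

    public static void main(String[] args) {
        // 1000 records, 8 CPU tasks, 80% of the input for the GPU task
        System.out.println(java.util.Arrays.toString(split(1000, 8, 80)));
    }
}
```

For 1000 records, 8 CPU tasks and gpuPercent=80, the GPU task gets 800 records and each CPU task gets 25.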

By the way, do you think a HybridBSP-based task, which extends BSP, will work
on Hama without any changes?
Normally it should, because HybridBSP inherits from BSP.

Thanks!

Martin

[1] http://i.imgur.com/RP3ETBW.png

2013/8/24 Chia-Hung Lin <clin4j@googlemail.com>

> It seems to me that an additional process or thread will be launched
> for running a GPU-based bsp task, which will then communicate with
> PipesBSP process, as [1]. Please correct me if it is wrong.
>
> If this is the case, BSPTask looks like the place to work on. While the
> BSPTask process is running, it can check (e.g. in runBSP) whether an
> additional GPU process/thread needs to be created, then launch/
> destroy it accordingly.
>
> By the way, it is mentioned that scheduling is needed. Can you please
> give a bit more detail on what kind of scheduling is required?
>
> [1]. http://i.imgur.com/RP3ETBW.png
>
>
> On 24 August 2013 00:59, Martin Illecker <martin@illecker.at> wrote:
> >>
> >> What's the difference between launching `bsp task' and `gpu bsp task'?
> >> Will gpu bsp task fork and execute c/ c++ process?
> >
> >
> > The GPU bsp task can also be executed within a Java process.
> >
> > In detail, I want to run a Rootbeer Kernel (e.g., PiEstimatorKernel [1])
> > within the bspGpu method.
> > A Rootbeer Kernel is written in Java and converted to CUDA. (The entry
> > point is the gpuMethod.)
> > Finally, there is a Java wrapper around the CUDA code, so it can be invoked
> > within the JVM.
> >
> > So far this does not differ from a normal bsp task execution, but I
> > want to use Hama Pipes to communicate via sockets.
> > The GPU bsp task should start like the default one, but I will have to
> > set up the Pipes server for communication.
> > And of course I need scheduling for these GPU and CPU tasks.
> >
> > I hope the following source will illustrate my scenario better:
> >
> > public class MyHybridBSP extends
> >      HybridBSP<NullWritable, NullWritable, NullWritable, NullWritable, Text> {
> >
> >    @Override
> >    public void bsp(BSPPeer<NullWritable, NullWritable, NullWritable, NullWritable, Text> peer)
> >        throws IOException, SyncException, InterruptedException {
> >
> >      // process algorithm on CPU
> >    }
> >
> >    @Override
> >    public void bspGpu(BSPPeer<NullWritable, NullWritable, NullWritable, NullWritable, Text> peer)
> >        throws IOException, SyncException, InterruptedException {
> >
> >      MyGPUKernel kernel = new MyGPUKernel();
> >      Rootbeer rootbeer = new Rootbeer();
> >      rootbeer.setThreadConfig(BLOCK_SIZE, GRID_SIZE, BLOCK_SIZE * GRID_SIZE);
> >
> >      // Run GPU Kernels
> >      rootbeer.runAll(kernel);
> >    }
> >
> >    class MyGPUKernel implements Kernel {
> >      public MyGPUKernel() { }
> >
> >      public void gpuMethod() {
> >        // process algorithm on GPU
> >
> >        // the following commands will need Hama Pipes
> >        HamaPeer.getConfiguration();
> >        HamaPeer.readNext(...,...);
> >        // and others....
> >      }
> >    }
> > }
> >
> > Thanks!
> >
> > Martin
> >
> > [1]
> > https://github.com/millecker/applications/blob/master/hama/rootbeer/piestimator/src/at/illecker/hama/rootbeer/examples/piestimator/gpu/PiEstimatorKernel.java
> >
> > 2013/8/23 Chia-Hung Lin <clin4j@googlemail.com>
> >
> >> What's the difference between launching `bsp task' and `gpu bsp task'?
> >> Will gpu bsp task fork and execute c/ c++ process?
> >>
> >> It might be good to distinguish how gpu bsp task will be executed,
> >> then deciding how to launch such task.
> >>
> >> Basically for launching a bsp task, an external process is created.
> >> The logic to execute BSP.bsp() is at
> >>
> >>     BSPTask.java
> >>
> >> where the method
> >>
> >>     runBSP()
> >>
> >> is called with a BSP implementation class loaded at runtime
> >>
> >>     Class<?> workClass = job.getConfiguration().getClass("bsp.work.class",
> >>         BSP.class);
> >>
> >> and then the bsp method is executed
> >>
> >>     bsp.bsp(bspPeer);
> >>
> >>
> >>
> >>
> >>
> >>
> >> On 23 August 2013 21:45, Martin Illecker <martin@illecker.at> wrote:
> >> > Hi,
> >> >
> >> > I have created a HybridBSP [1] class which should combine the default BSP
> >> > (CPU) class with GPU methods [2].
> >> >
> >> > The abstract HybridBSP class extends the BSP class and adds the bspGpu,
> >> > setupGpu and cleanupGpu methods.
> >> >
> >> > public abstract class HybridBSP<K1, V1, K2, V2, M extends Writable> extends
> >> >     BSP<K1, V1, K2, V2, M> implements BSPGpuInterface<K1, V1, K2, V2, M> {
> >> >
> >> >   @Override
> >> >   public abstract void bspGpu(BSPPeer<K1, V1, K2, V2, M> peer)
> >> >       throws IOException, SyncException, InterruptedException;
> >> >
> >> >   @Override
> >> >   public void setupGpu(BSPPeer<K1, V1, K2, V2, M> peer) throws IOException,
> >> >       SyncException, InterruptedException {
> >> >   }
> >> >
> >> >   @Override
> >> >   public void cleanupGpu(BSPPeer<K1, V1, K2, V2, M> peer) throws IOException {
> >> >   }
> >> > }
> >> >
> >> >
> >> > Now I want to add a new scheduling technique which checks the conf property
> >> > (gpuBspTaskNum) and executes bspGpu instead of the default bsp method.
> >> >
> >> > E.g., bspTaskNum=3 and gpuBspTaskNum=1:
> >> > The scheduler should run four bsp tasks simultaneously, executing the bsp
> >> > method three times and bspGpu once. (Both are defined within one derived
> >> > HybridBSP class.)
> >> >
> >> > Do I have to modify the taskrunner or create a new SimpleTaskScheduler?
> >> >
> >> > How can I integrate this into Hama?
> >> >
> >> > Thanks!
> >> >
> >> > Martin
> >> >
> >> > [1]
> >> > https://github.com/millecker/hama/blob/5d0e8b26abd6b63fa5afad09a2ba960bf9922868/core/src/main/java/org/apache/hama/bsp/gpu/HybridBSP.java
> >> > [2]
> >> > https://github.com/millecker/hama/blob/5d0e8b26abd6b63fa5afad09a2ba960bf9922868/core/src/main/java/org/apache/hama/bsp/gpu/BSPGpuInterface.java
> >>
>
