Subject: Re: Java On GPU: Rootbeer, Java 8
From: Shubham Mehta
To: dev@horn.incubator.apache.org
Date: Thu, 29 Oct 2015 13:51:05 +0900

That's a really good point. I also guess that we will need to provide
mathematical functions, so that users can define their neuron objects
using these functions. Does anyone happen to know of any project where
GPU enable/disable functionality was provided before?
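To make the arithmetic-operation idea from Edward's mail below a bit
more concrete, here is the kind of API I have in mind. This is purely a
hypothetical sketch, none of these class or method names exist in Horn
yet, but the point is that if users compose provided operations instead
of writing raw arithmetic, the runtime stays free to either execute them
on the CPU (as the fallback bodies here do) or treat them as hooks for
generating equivalent GPU code:

// Hypothetical sketch only: these are not existing Horn APIs.
public abstract class Neuron {

  // Provided operations; CPU fallback implementations shown.
  protected double multiply(double a, double b) { return a * b; }
  protected double add(double a, double b)      { return a + b; }
  protected double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

  // Users override this using only the operations above, never
  // raw arithmetic, so the formula stays translatable to the GPU.
  public abstract double compute(double[] weights, double[] inputs);
}

class WeightedSumNeuron extends Neuron {
  @Override
  public double compute(double[] weights, double[] inputs) {
    double sum = 0.0;
    for (int i = 0; i < inputs.length; i++) {
      sum = add(sum, multiply(weights[i], inputs[i]));
    }
    return sigmoid(sum);
  }
}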
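For comparison, this is roughly what the same vector addition looks like
in Aparapi, which I suggested in the thread below: you subclass Kernel,
and the bytecode of run() is translated to OpenCL at execution time.
This sketch assumes the com.aparapi package of the GitHub fork; the
original AMD build used com.amd.aparapi instead:

import com.aparapi.Kernel;
import com.aparapi.Range;

public class AparapiExample {
  public static void main(String[] args) {
    final int n = 1024;
    final float[] a = new float[n];
    final float[] b = new float[n];
    final float[] sum = new float[n];

    Kernel kernel = new Kernel() {
      @Override
      public void run() {
        int i = getGlobalId(); // index of this OpenCL work-item
        sum[i] = a[i] + b[i];
      }
    };
    // Translates run() to OpenCL and executes n work-items.
    kernel.execute(Range.create(n));
    kernel.dispose();
  }
}

Notably, when no OpenCL device is available, Aparapi falls back to
running the kernel on a Java thread pool, which is itself an existing
example of the GPU enable/disable behaviour I asked about above.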
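And this is the Rootbeer programming model discussed further down, as I
remember it from the rootbeer1 README, so treat the package names as
approximate (older builds used edu.syr.pcpratts.* instead of
org.trifort.*): you implement their Kernel interface, and Rootbeer.run()
performs the (de)serialization described in the paper and launches
gpuMethod() on the GPU:

import java.util.ArrayList;
import java.util.List;
import org.trifort.rootbeer.runtime.Kernel;
import org.trifort.rootbeer.runtime.Rootbeer;

// One kernel instance per GPU thread; its fields are what
// Rootbeer serializes into GPU memory before the launch.
class ScalarAddKernel implements Kernel {
  private final int[] a;
  private final int[] b;
  private final int index;

  ScalarAddKernel(int[] a, int[] b, int index) {
    this.a = a;
    this.b = b;
    this.index = index;
  }

  @Override
  public void gpuMethod() {
    a[index] += b[index];
  }
}

public class RootbeerExample {
  public static void main(String[] args) {
    int n = 1024;
    int[] a = new int[n];
    int[] b = new int[n];
    List<Kernel> jobs = new ArrayList<Kernel>();
    for (int i = 0; i < n; i++) {
      jobs.add(new ScalarAddKernel(a, b, i));
    }
    // Serializes the kernels to the GPU, runs them, copies back.
    new Rootbeer().run(jobs);
  }
}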
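On my own Java 8 question in the thread below: as far as I understand,
the parallel stream API on a stock JVM only parallelizes across CPU
cores via the common fork/join pool; the GPU offload I read about was
research and vendor work (e.g., OpenJDK's Project Sumatra and IBM's JIT
experiments) rather than anything in the standard JDK. The CPU version
is nearly a one-liner:

import java.util.stream.IntStream;

public class StreamsExample {
  public static void main(String[] args) {
    int n = 1024;
    float[] a = new float[n];
    float[] b = new float[n];
    float[] sum = new float[n];

    // Runs across CPU cores on the common fork/join pool;
    // a stock Java 8 JVM does not offload this to the GPU.
    IntStream.range(0, n)
             .parallel()
             .forEach(i -> sum[i] = a[i] + b[i]);
  }
}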
On Thu, Oct 29, 2015 at 1:16 PM, Edward J. Yoon wrote:
> We also need to think about whether hiding the GPU acceleration code
> is possible.
>
> To support enabling/disabling GPU acceleration without code changes
> on the user side, we maybe should not allow users to write formulas
> for calculations directly within the neuron-centric programming model
> if we can't accelerate the calculations of (user-defined) neuron
> objects directly on the GPU.
>
> I roughly guess we will need to provide arithmetic operation methods
> like a functional language, e.g., multiply(), ..., etc. Then we can
> generate the code for the GPU internally.
>
> On Thu, Oct 29, 2015 at 9:20 AM, Shubham Mehta wrote:
> > GPU programming can be done in two ways:
> > -> CUDA: it is limited to very few vendors
> > -> OpenCL: it is much more portable
> >
> > A comparative study of *CUDA* and *OpenCL*:
> > http://wiki.tiker.net/CudaVsOpenCL
> >
> > *Conclusion:* I personally think that we should go with OpenCL,
> > even though we lose a bit of performance.
> >
> > If we go with OpenCL, then the Rootbeer option is ruled out;
> > besides, Rootbeer is not maintained at present. I suggest that we
> > go for *Aparapi* (https://github.com/aparapi/aparapi), which is
> > also open source and well maintained. *Aparapi* is based on OpenCL.
> >
> > A comparative study of *JCuda* and *Aparapi*:
> > http://www.des.udc.es/~juan/papers/eoops2013.pdf
> > Its conclusion says that JCuda requires more programming effort but
> > can give better performance when used with optimized CUDA libraries
> > (CUFFT and CUBLAS).
> >
> > On Wed, Oct 28, 2015 at 1:30 PM, Edward J. Yoon wrote:
> >
> >> Cool, I can look closely at the GC problem of Rootbeer and at
> >> Java 8.
> >>
> >> On Wednesday, 28 October 2015, Shubham Mehta wrote:
> >>
> >> > Hi,
> >> >
> >> > As you know, Edward suggested that we could go with Rootbeer
> >> > (https://raw.githubusercontent.com/pcpratts/rootbeer1/master/doc/rootbeer1_paper.pdf).
> >> > So I was going through their paper on how they support GPU
> >> > acceleration for Java.
> >> >
> >> > The main idea is cross-compilation of Java bytecode to CUDA.
> >> > They tried to get the highest (de)serialization performance to
> >> > and from GPU memory by comparing a few approaches: JNI, pure
> >> > Java, and reflection. Finally, they used pure Java to read
> >> > everything into a Java byte array, followed by one JNI call to
> >> > copy everything to GPU memory.
> >> >
> >> > Each Java object is represented in GPU memory as two segments:
> >> > static memory and instance memory. They mostly haven't done
> >> > garbage collection on the GPU, which I think shouldn't bother
> >> > us.
> >> >
> >> > For code conversion to CUDA, they use the *Soot Optimization
> >> > Framework* (Raja Vallée-Rai, Laurie Hendren, Vijay Sundaresan,
> >> > Patrick Lam, Etienne Gagnon and Phong Co. Soot - a Java
> >> > Optimization Framework) to load .class files from a jar into an
> >> > intermediate in-memory representation called Jimple, which is
> >> > then analyzed to generate CUDA code.
> >> >
> >> > *In short, it will serve our purpose of running computations on
> >> > GPUs for the neuron-centric programming model.*
> >> >
> >> > But before taking a final decision, it would be great if someone
> >> > could look into the following paper:
> >> > "*Evaluation of Java for General Purpose GPU Computing*"
> >> > This paper cites Rootbeer and does a comparative study of all
> >> > available frameworks for running Java code on GPGPUs. I couldn't
> >> > access it.
> >> >
> >> > Also, I read that Java 8 provides parallel stream APIs with
> >> > lambda expressions to facilitate parallel programming for
> >> > multi-core CPUs and many-core GPUs. Does anyone know about this
> >> > in more detail?
> >> >
> >> > Lastly, people are trying to use machine learning to decide
> >> > between GPU and CPU at runtime. I don't know how well it scales;
> >> > it is a very recent study:
> >> > "*Machine-Learning-based Performance Heuristics for Runtime
> >> > CPU/GPU Selection*"
> >> >
> >> > Best Regards,
> >> > Shubham

--
Shubham Mehta
Software Engineer
Samsung Electronics
Suwon, South Korea