mxnet-dev mailing list archives

From YiZhi Liu <javeli...@gmail.com>
Subject Re: Java API for MXNet
Date Wed, 16 Aug 2017 18:21:39 GMT
I agree with Sandeep, though I would guess the performance won't change
much. But yes, benchmarks will tell.
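
To make the adapter cost Sandeep describes concrete, here is a minimal Java
sketch. It is a stand-in under stated assumptions, not code from the MXNet
tree: the class `ShapeAdapter` and both `elementCount` overloads are
hypothetical, and a plain `int[]` stands in for the Scala `Seq` that a real
wrapper over the Scala package would convert to, so the sketch runs without
Scala on the classpath.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from the MXNet code base.
final class ShapeAdapter {

    // Stand-in for the lower-level API (Scala or JNI), which takes a primitive array.
    public static long elementCount(int[] shape) {
        long n = 1;
        for (int d : shape) {
            n *= d;
        }
        return n;
    }

    // Java-facing entry point: unboxes and copies the list into an int[] on every call.
    // This per-call copy is the kind of adapter overhead under discussion.
    public static long elementCount(List<Integer> shape) {
        int[] dims = new int[shape.size()];
        for (int i = 0; i < dims.length; i++) {
            dims[i] = shape.get(i);
        }
        return elementCount(dims);
    }

    public static void main(String[] args) {
        // A 2 x 3 x 4 array has 24 elements.
        System.out.println(elementCount(Arrays.asList(2, 3, 4)));
    }
}
```

Timing the list overload against the array overload over many calls would be
one rough way to get the kind of number a benchmark should settle.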

Moreover, in the Scala package we use macros to generate operators
automatically, which would require more effort if we switch to pure
Java.

2017-08-17 2:12 GMT+08:00 sandeep krishnamurthy <sandeep.krishna98@gmail.com>:
> The fastest way to get Java binding is through building Java native
> wrappers on Scala package.
> Disadvantages would be:
>    * *Bloated library size:* May not be suitable for users planning to use
> Java APIs on Android or other such smaller systems.
>    * *Performance:* Performance may not be as good as building directly
> over JNI and implementing ground up. For example, taking NDArray dimensions
> as Java ArrayList then converting it to Scala Seq to adapt for Scala
> NDArray API and more such adapters.
>
> However, building ground up from JNI would be a huge effort without
> actually getting feedback from users early.
>
> *My Plan:*
> 1. Build Java interface on top of Scala package.
> 2. Get early feedback from users. It may turn out Java is not a great
> candidate for DL training jobs.
> 3. Solidify the interface (APIs) for Java users.
> 4. Do performance benchmarks comparing the native Scala API with the Java
> interface. This gives us comparable numbers on performance in Java.
> 5. Over a period of time, replace the underlying Scala usage with a JNI base
> and native Java implementation, provided feedback from users is positive.
>
> Comments/Suggestion?
>
> Regards,
> Sandeep
>
>
> On Wed, Aug 16, 2017 at 10:56 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
>
>> What Nan and I worried about is the re-implementation of something
>> like https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/Model.scala#L246,
>> and the executorManager, NDArray, KVStore ... it uses.
>>
>> The C API stays at the very low level. If this is the purpose, we can
>> simply move ml.dmlc.mxnet.LibInfo to 'java' folder and compile without
>> scala, no need to introduce JavaCPP. But I don't think this is what
>> users want.
>>
>> 2017-08-17 1:41 GMT+08:00 Joern Kottmann <kottmann@gmail.com>:
>> > There will be a new scala version one day, and the story we had with
>> > going from 2.10 to 2.11 might just repeat. In the end if you make a
>> > dependency using scala you just end up making it for the currently
>> > popular scala versions. And that might be ok for projects with
>> > developers who are familiar with these issues, but it is not ok for
>> > java projects, where people might not expect it or know about these
>> > problems. It just makes it harder to use.
>> >
>> > To me it looks like the C API is very stable and used by all/most
>> > other APIs. If we have a Java API - accessing the C API via JavaCPP -
>> > then we should end up with a pretty stable solution and a lot of the code
>> > that is duplicated with the Scala API is the generated code.
>> >
>> > I think we should explore this possible way of implementing it with a
>> > proof-of-concept.
>> >
>> > And if we have a well made Java API it might be something which maybe
>> > wouldn't need a lot of additions to be pleasurable to use from scala.
>> >
>> > Jörn
>> >
>> > On Wed, Aug 16, 2017 at 6:45 PM, Nan Zhu <zhunanmcgill@gmail.com> wrote:
>> >> I don't think there will be problems under "11", did the user see
>> >> concrete errors?
>> >>
>> >> Best,
>> >>
>> >> Nan
>> >>
>> >>
>> >>
>> >> On Wed, Aug 16, 2017 at 9:30 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
>> >>
>> >>> Hi Nan,
>> >>>
>> >>> Users have 2.11, but with a different minor version, will it cause
>> >>> conflicts?
>> >>>
>> >>> 2017-08-17 0:19 GMT+08:00 Nan Zhu <zhunanmcgill@gmail.com>:
>> >>> > Hi, Yizhi,
>> >>> >
>> >>> > You mean users have 2.10 env while we assemble 2.11 in it?
>> >>> >
>> >>> > Best,
>> >>> >
>> >>> > Nan
>> >>> >
>> >>> > On Wed, Aug 16, 2017 at 9:08 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
>> >>> >
>> >>> >> Hi Joern,
>> >>> >>
>> >>> >> The point is that, the front is not a simple wrapper of c_api.h, as
>> >>> >> you mentioned, which can be easily achieved by JavaCPP.
>> >>> >>
>> >>> >> I have noticed the potential conflicts between the assembled scala
>> >>> >> library and the one in users' environment. Can we remove the scala
>> >>> >> library from the assembly jar? @Nan It wouldn't be a problem since the
>> >>> >> scala libraries with same major version are compatible.
>> >>> >>
>> >>> >> 2017-08-16 23:49 GMT+08:00 Joern Kottmann <kottmann@gmail.com>:
>> >>> >> > Hello,
>> >>> >> >
>> >>> >> > I personally had quite some issues with Scala dependencies in
>> >>> >> > different versions and Spark, where one version is not compatible with
>> >>> >> > the other version. Then you need to debug the dependency tree to find
>> >>> >> > the places where the versions don't match. Every project which would
>> >>> >> > like to use MXNet then has to depend on Scala and might also get
>> >>> >> > conflicts if other dependencies depend on different Scala versions.
>> >>> >> > Probably something which will cause issues for some of your users.
>> >>> >> > Users who want to use Java might not be familiar with Scala dependency
>> >>> >> > problems and have a hard time resolving them by getting strange error
>> >>> >> > messages.
>> >>> >> >
>> >>> >> > The JNI layer could be generated with JavaCPP, then we would not need
>> >>> >> > to write/maintain the C and the JVM side for that ourselves.
>> >>> >> > A good example of JavaCPP and Scala usage is Apache Mahout [1].
>> >>> >> >
>> >>> >> > Even if we don't use JavaCPP, the JNI layer should be easy to get into
>> >>> >> > a state where both can share it, the current Scala JNI layers LibInfo
>> >>> >> > classes could be converted to Java classes and would in most cases
>> >>> >> > require only minor changes in the Scala code.
>> >>> >> >
>> >>> >> > Jörn
>> >>> >> >
>> >>> >> > [1] https://github.com/apache/mahout/tree/master/viennacl/src/main
>> >>> >> >
>> >>> >> > On Wed, Aug 16, 2017 at 5:30 PM, Nan Zhu <zhunanmcgill@gmail.com> wrote:
>> >>> >> >> I agree with Yizhi
>> >>> >> >>
>> >>> >> >> My major concern is the duplicate implementations, which are usually
>> >>> >> >> one of the major sources of bugs, especially with two languages which
>> >>> >> >> are naturally interactive (OK, calling Scala from Java might need some
>> >>> >> >> more effort). It is just like we provide C++ & C APIs of MXNet in two
>> >>> >> >> separated packages.
>> >>> >> >>
>> >>> >> >> About dependency problem, when you say "As far as I see this has the
>> >>> >> >> great disadvantage that the Java API would force Scala as a dependency
>> >>> >> >> onto the java users.", would you please give a concrete example causing
>> >>> >> >> critical issues?
>> >>> >> >>
>> >>> >> >> Best,
>> >>> >> >>
>> >>> >> >> Nan
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> On Wed, Aug 16, 2017 at 8:19 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
>> >>> >> >>
>> >>> >> >>> Hi,
>> >>> >> >>>
>> >>> >> >>> If we build the Java API from the very beginning, i.e. the JNI part,
>> >>> >> >>> we have to rewrite the code for training, predict, inferShape, etc.
>> >>> >> >>> It would be too heavy to maintain a totally new front language.
>> >>> >> >>>
>> >>> >> >>> As far as I see, I don't think Scala library dependency would be a big
>> >>> >> >>> problem in most cases, unless we are going to use it in embedded
>> >>> >> >>> devices. Could you illustrate some use-cases where you cannot involve
>> >>> >> >>> Scala dependencies?
>> >>> >> >>>
>> >>> >> >>> 2017-08-16 22:13 GMT+08:00 Joern Kottmann <kottmann@gmail.com>:
>> >>> >> >>> > Hello,
>> >>> >> >>> >
>> >>> >> >>> > the approach which is taken by Spark is described here [1].
>> >>> >> >>> >
>> >>> >> >>> > As far as I see this has the great disadvantage that the Java API
>> >>> >> >>> > would force Scala as a dependency onto the java users.
>> >>> >> >>> > For a library it is always a great advantage if it doesn't have many
>> >>> >> >>> > dependencies, or zero dependencies. In our case it could be quite
>> >>> >> >>> > realistic to have a thin wrapper around the C API without needing any
>> >>> >> >>> > other dependencies (or only dependencies which can't be avoided).
>> >>> >> >>> >
>> >>> >> >>> > The JNI layer could easily be shared between the Java and Scala API.
>> >>> >> >>> > As far as I understand, the JNI layer in the Scala API is private
>> >>> >> >>> > anyway, and a change to it wouldn't require that the public part of
>> >>> >> >>> > the Scala API is changed.
>> >>> >> >>> >
>> >>> >> >>> > What do you think?
>> >>> >> >>> >
>> >>> >> >>> > Jörn
>> >>> >> >>> >
>> >>> >> >>> > [1] https://cwiki.apache.org/confluence/display/SPARK/Java+API+Internals
>> >>> >> >>> >
>> >>> >> >>> > On Wed, Aug 16, 2017 at 3:39 PM, YiZhi Liu <javelinjs@gmail.com> wrote:
>> >>> >> >>> >> Hi Joern,
>> >>> >> >>> >>
>> >>> >> >>> >> I suggest building the Java API as a wrapper of the Scala API, re-using
>> >>> >> >>> >> most of the procedures. See the Java API in Apache Spark for reference.
>> >>> >> >>> >>
>> >>> >> >>> >> 2017-08-16 18:21 GMT+08:00 Joern Kottmann <joern@apache.org>:
>> >>> >> >>> >>> Hello all,
>> >>> >> >>> >>>
>> >>> >> >>> >>> I would like to propose the addition of a Java API to MXNet.
>> >>> >> >>> >>>
>> >>> >> >>> >>> There has been some previous work done for the Scala API, and it makes
>> >>> >> >>> >>> sense to at least share the JNI layer between the two.
>> >>> >> >>> >>>
>> >>> >> >>> >>> The Java API probably should be aligned with the Python API (and
>> >>> >> >>> >>> others which exist already) with a few changes to give it a native
>> >>> >> >>> >>> Java feel.
>> >>> >> >>> >>>
>> >>> >> >>> >>> As far as I understand there are multiple people interested in working
>> >>> >> >>> >>> on this and it would be good to maybe come up with a written proposal
>> >>> >> >>> >>> on how things should be.
>> >>> >> >>> >>>
>> >>> >> >>> >>> My motivation is to get a Java API which can be used by Apache OpenNLP
>> >>> >> >>> >>> to solve various NLP tasks using Deep Learning based approaches and I
>> >>> >> >>> >>> am also interested to work on MXNet.
>> >>> >> >>> >>>
>> >>> >> >>> >>> Jörn
>> >>> >> >>> >>
>> >>> >> >>> >>
>> >>> >> >>> >>
>> >>> >> >>> >> --
>> >>> >> >>> >> Yizhi Liu
>> >>> >> >>> >> DMLC member
>> >>> >> >>> >> Technical Manager
>> >>> >> >>> >> Qihoo 360 Inc, Shanghai, China
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>> --
>> >>> >> >>> Yizhi Liu
>> >>> >> >>> DMLC member
>> >>> >> >>> Technical Manager
>> >>> >> >>> Qihoo 360 Inc, Shanghai, China
>> >>> >> >>>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> Yizhi Liu
>> >>> >> DMLC member
>> >>> >> Technical Manager
>> >>> >> Qihoo 360 Inc, Shanghai, China
>> >>> >>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Yizhi Liu
>> >>> DMLC member
>> >>> Technical Manager
>> >>> Qihoo 360 Inc, Shanghai, China
>> >>>
>>
>>
>>
>> --
>> Yizhi Liu
>> DMLC member
>> Technical Manager
>> Qihoo 360 Inc, Shanghai, China
>>
>
>
>
> --
> Sandeep Krishnamurthy



-- 
Yizhi Liu
DMLC member
Technical Manager
Qihoo 360 Inc, Shanghai, China
