mxnet-dev mailing list archives

From Nan Zhu <zhunanmcg...@gmail.com>
Subject Re: Java API for MXNet
Date Wed, 16 Aug 2017 19:05:15 GMT
+1 for Sandeep's suggestion

On Wed, Aug 16, 2017 at 11:21 AM, YiZhi Liu <javelinjs@gmail.com> wrote:

> Agree with Sandeep, while I guess the performance won't change. But
> yes, benchmark talks.
>
> Moreover, in Scala package we use macros to generate operators
> automatically, which will require more efforts if we switch to pure
> Java.
>
> 2017-08-17 2:12 GMT+08:00 sandeep krishnamurthy <sandeep.krishna98@gmail.com>:
> > The fastest way to get Java binding is through building Java native
> > wrappers on Scala package.
> > Disadvantages would be:
> >    * *Bloated library size:* May not be suitable for users planning to
> >      use Java APIs on Android or other such smaller systems.
> >    * *Performance:* Performance may not be as good as building directly
> >      over JNI and implementing from the ground up. For example, taking
> >      NDArray dimensions as a Java ArrayList, then converting it to a Scala
> >      Seq to adapt to the Scala NDArray API, and more such adapters.
> >
> > However, building ground up from JNI would be a huge effort without
> > actually getting feedback from users early.
> >
> > *My Plan:*
> > 1. Build Java interface on top of Scala package.
> > 2. Get early feedback from users. It may turn out Java is not a great
> > candidate for DL training jobs.
> > 3. Solidify the interface (APIs) for Java users.
> > 4. Do performance benchmarks to see Scala Native / Java interface. This
> > gives us comparable numbers on performance in Java.
> > 5. Over a period of time replace underlying Scala usage with JNI base and
> > native Java implementation. Provided feedback from users is positive.
> >
> > Comments/Suggestion?
> >
> > Regards,
> > Sandeep
> >
> >
> > On Wed, Aug 16, 2017 at 10:56 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
> >
> >> What Nan and I worried about is the re-implementation of something like
> >> https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/Model.scala#L246,
> >> and the executorManager, NDArray, KVStore ... it uses.
> >>
> >> the C API stays at the very low level. If this is the purpose, we can
> >> simply move ml.dmlc.mxnet.LibInfo to 'java' folder and compile without
> >> scala, no need to introduce JavaCPP. But I don't think this is what
> >> users want.
> >>
> >> 2017-08-17 1:41 GMT+08:00 Joern Kottmann <kottmann@gmail.com>:
> >> > There will be a new scala version one day, and the story we had with
> >> > going from 2.10 to 2.11 might just repeat. In the end if you make a
> >> > dependency using scala you just end up making it for the currently
> >> > popular scala versions. And that might be ok for projects with
> >> > developers who are familiar with these issues, but it is not ok for
> >> > java projects, where people might not expect it or know about these
> >> > problems. It just makes it harder to use.
> >> >
> >> > To me it looks like that the C API is very stable and used by all/most
> >> > other APIs. If we have a Java API - accessing the C API via JavaCPP -
> >> > then we should end up with a pretty stable solution and a lot of the code
> >> > that is duplicated with the Scala API is the generated code.
> >> >
> >> > I think we should explore this possible way of implementing it with a
> >> > proof-of-concept.
> >> >
> >> > And if we have a well made Java API it might be something which maybe
> >> > wouldn't need a lot of additions to be pleasurable to use from scala.
> >> >
> >> > Jörn
> >> >
> >> > On Wed, Aug 16, 2017 at 6:45 PM, Nan Zhu <zhunanmcgill@gmail.com> wrote:
> >> >> I don't think there will be problems under "11", did the user see
> >> >> concrete errors?
> >> >>
> >> >> Best,
> >> >>
> >> >> Nan
> >> >>
> >> >>
> >> >>
> >> >> On Wed, Aug 16, 2017 at 9:30 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
> >> >>
> >> >>> Hi Nan,
> >> >>>
> >> >>> Users have 2.11, but with a different minor version, will it cause
> >> >>> conflicts?
> >> >>>
> >> >>> 2017-08-17 0:19 GMT+08:00 Nan Zhu <zhunanmcgill@gmail.com>:
> >> >>> > Hi, Yizhi,
> >> >>> >
> >> >>> > You mean users have 2.10 env while we assemble 2.11 in it?
> >> >>> >
> >> >>> > Best,
> >> >>> >
> >> >>> > Nan
> >> >>> >
> >> >>> > On Wed, Aug 16, 2017 at 9:08 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
> >> >>> >
> >> >>> >> Hi Joern,
> >> >>> >>
> >> >>> >> The point is that, the front is not a simple wrapper of c_api.h, as
> >> >>> >> you mentioned, which can be easily achieved by JavaCPP.
> >> >>> >>
> >> >>> >> I have noticed the potential conflicts between the assembled scala
> >> >>> >> library and the one in users' environment. Can we remove the scala
> >> >>> >> library from the assembly jar? @Nan It wouldn't be a problem since
> >> >>> >> the scala libraries with same major version are compatible.
> >> >>> >>
> >> >>> >> 2017-08-16 23:49 GMT+08:00 Joern Kottmann <kottmann@gmail.com>:
> >> >>> >> > Hello,
> >> >>> >> >
> >> >>> >> > I personally had quite some issues with Scala dependencies in
> >> >>> >> > different versions and Spark, where one version is not compatible
> >> >>> >> > with the other version. Then you need to debug the dependency tree
> >> >>> >> > to find the places where the versions don't match. Every project
> >> >>> >> > which would like to use MXNet then has to depend on Scala and might
> >> >>> >> > also get conflicts if other dependencies depend on different Scala
> >> >>> >> > versions. Probably something which will cause issues for some of
> >> >>> >> > your users. Users who want to use Java might not be familiar with
> >> >>> >> > Scala dependency problems and have a hard time resolving them by
> >> >>> >> > getting strange error messages.
> >> >>> >> >
> >> >>> >> > The JNI layer could be generated with JavaCPP, then we would not
> >> >>> >> > need to write/maintain the C and the JVM side for that ourselves.
> >> >>> >> > A good example of JavaCPP and Scala usage is Apache Mahout [1].
> >> >>> >> >
> >> >>> >> > Even if we don't use JavaCPP, the JNI layer should be easy to get
> >> >>> >> > into a state where both can share it, the current Scala JNI layers
> >> >>> >> > LibInfo classes could be converted to Java classes and would in
> >> >>> >> > most cases require only minor changes in the Scala code.
> >> >>> >> >
> >> >>> >> > Jörn
> >> >>> >> >
> >> >>> >> > [1] https://github.com/apache/mahout/tree/master/viennacl/src/main
> >> >>> >> >
> >> >>> >> > On Wed, Aug 16, 2017 at 5:30 PM, Nan Zhu <zhunanmcgill@gmail.com> wrote:
> >> >>> >> >> I agree with Yizhi
> >> >>> >> >>
> >> >>> >> >> My major concern is the duplicate implementations, which are
> >> >>> >> >> usually one of the major sources of bugs, especially with two
> >> >>> >> >> languages which are naturally interactive (OK, Calling Scala from
> >> >>> >> >> Java might need some more efforts). It is just like we provide C++
> >> >>> >> >> & C APIs of MXNet in two separated packages.
> >> >>> >> >>
> >> >>> >> >> About dependency problem, when you say "As far as I see this has
> >> >>> >> >> the great disadvantage that the Java API would force Scala as a
> >> >>> >> >> dependency onto the java users.", would you please give a concrete
> >> >>> >> >> example causing critical issues?
> >> >>> >> >>
> >> >>> >> >> Best,
> >> >>> >> >>
> >> >>> >> >> Nan
> >> >>> >> >>
> >> >>> >> >>
> >> >>> >> >>
> >> >>> >> >> On Wed, Aug 16, 2017 at 8:19 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
> >> >>> >> >>
> >> >>> >> >>> Hi,
> >> >>> >> >>>
> >> >>> >> >>> If we build the Java API from the very beginning, i.e. the JNI
> >> >>> >> >>> part, we have to rewrite the codes for training, predict,
> >> >>> >> >>> inferShape, etc. It would be too heavy to maintain a totally new
> >> >>> >> >>> front language.
> >> >>> >> >>>
> >> >>> >> >>> As far as I see, I don't think Scala library dependency would be
> >> >>> >> >>> a big problem in most cases, unless we are going to use it in
> >> >>> >> >>> embedded devices. Could you illustrate some use-cases where you
> >> >>> >> >>> cannot involve Scala dependencies?
> >> >>> >> >>>
> >> >>> >> >>> 2017-08-16 22:13 GMT+08:00 Joern Kottmann <kottmann@gmail.com>:
> >> >>> >> >>> > Hello,
> >> >>> >> >>> >
> >> >>> >> >>> > the approach which is taken by Spark is described here [1].
> >> >>> >> >>> >
> >> >>> >> >>> > As far as I see this has the great disadvantage that the Java
> >> >>> >> >>> > API would force Scala as a dependency onto the java users.
> >> >>> >> >>> > For a library it is always a great advantage if it doesn't have
> >> >>> >> >>> > many dependencies, or zero dependencies. In our case it could be
> >> >>> >> >>> > quite realistic to have a thin wrapper around the C API without
> >> >>> >> >>> > needing any other dependencies (or only dependencies which can't
> >> >>> >> >>> > be avoided).
> >> >>> >> >>> >
> >> >>> >> >>> > The JNI layer could easily be shared between the Java and Scala
> >> >>> >> >>> > API. As far as I understand is the JNI layer in the Scala API
> >> >>> >> >>> > anyway private and a change to it wouldn't require that the
> >> >>> >> >>> > public part of the Scala API is changed.
> >> >>> >> >>> >
> >> >>> >> >>> > What do you think?
> >> >>> >> >>> >
> >> >>> >> >>> > Jörn
> >> >>> >> >>> >
> >> >>> >> >>> > [1] https://cwiki.apache.org/confluence/display/SPARK/Java+API+Internals
> >> >>> >> >>> >
> >> >>> >> >>> > On Wed, Aug 16, 2017 at 3:39 PM, YiZhi Liu <javelinjs@gmail.com> wrote:
> >> >>> >> >>> >> Hi Joern,
> >> >>> >> >>> >>
> >> >>> >> >>> >> I suggest to build Java API as a wrapper of Scala API, re-use
> >> >>> >> >>> >> most of the procedures. Referring to the Java API in Apache
> >> >>> >> >>> >> Spark.
> >> >>> >> >>> >>
> >> >>> >> >>> >> 2017-08-16 18:21 GMT+08:00 Joern Kottmann <joern@apache.org>:
> >> >>> >> >>> >>> Hello all,
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> I would like to propose the addition of a Java API to MXNet.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> There has been some previous work done for the Scala API, and
> >> >>> >> >>> >>> it makes sense to at least share the JNI layer between the
> >> >>> >> >>> >>> two.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> The Java API probably should be aligned with the Python API
> >> >>> >> >>> >>> (and others which exist already) with a few changes to give it
> >> >>> >> >>> >>> a native Java feel.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> As far as I understand there are multiple people interested to
> >> >>> >> >>> >>> work on this and it would be good to maybe come up with a
> >> >>> >> >>> >>> written proposal on how things should be.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> My motivation is to get a Java API which can be used by Apache
> >> >>> >> >>> >>> OpenNLP to solve various NLP tasks using Deep Learning based
> >> >>> >> >>> >>> approaches and I am also interested to work on MXNet.
> >> >>> >> >>> >>>
> >> >>> >> >>> >>> Jörn
> >> >>> >> >>> >>
> >> >>> >> >>> >>
> >> >>> >> >>> >>
> >> >>> >> >>> >> --
> >> >>> >> >>> >> Yizhi Liu
> >> >>> >> >>> >> DMLC member
> >> >>> >> >>> >> Technical Manager
> >> >>> >> >>> >> Qihoo 360 Inc, Shanghai, China
> >
> >
> >
> > --
> > Sandeep Krishnamurthy
>
>
>
> --
> Yizhi Liu
> DMLC member
> Technical Manager
> Qihoo 360 Inc, Shanghai, China
>
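
The adapter overhead raised in the thread (a Java collection converted on each call before reaching the underlying API) can be sketched in plain Java. This is only an illustration under assumptions: `CoreNDArray` is a hypothetical stand-in for the Scala-side NDArray, and `JavaNDArray.zeros` is an invented facade method; neither is MXNet's actual API. The List-to-int[] copy plays the role of the ArrayList-to-Seq conversion.

```java
import java.util.Arrays;
import java.util.List;

public class JavaFacadeSketch {

    // Hypothetical stand-in for the existing Scala-side NDArray; not MXNet's real class.
    static final class CoreNDArray {
        private final int[] shape;
        private final float[] data;

        CoreNDArray(int[] shape) {
            this.shape = shape.clone();
            int size = 1;
            for (int dim : shape) {
                size *= dim;
            }
            this.data = new float[size]; // Java zero-initializes new arrays
        }

        int[] shape() { return shape.clone(); }
        int size() { return data.length; }
    }

    // Hypothetical Java-friendly facade that only delegates to the core class.
    static final class JavaNDArray {
        private final CoreNDArray core;

        private JavaNDArray(CoreNDArray core) { this.core = core; }

        // Accepts a Java List and converts it on every call; this copy is the adapter cost.
        static JavaNDArray zeros(List<Integer> shape) {
            int[] dims = new int[shape.size()];
            for (int i = 0; i < dims.length; i++) {
                dims[i] = shape.get(i);
            }
            return new JavaNDArray(new CoreNDArray(dims));
        }

        int[] shape() { return core.shape(); }
        int size() { return core.size(); }
    }

    public static void main(String[] args) {
        JavaNDArray a = JavaNDArray.zeros(Arrays.asList(2, 3));
        System.out.println(Arrays.toString(a.shape()) + " size=" + a.size());
    }
}
```

A benchmark along the lines of step 4 of Sandeep's plan would compare this per-call conversion against a call that goes straight to a JNI-backed implementation.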
