mxnet-dev mailing list archives

From "Zha, Sheng" <zhash...@amazon.com>
Subject Re: Java API for MXNet
Date Wed, 16 Aug 2017 20:24:51 GMT
It would be great to see some examples of the benefits of the proposed API. I understand
that certain syntax may look unsatisfying when calling the Scala API from Java, so the questions
that should be asked are: 1. Is it intolerable to the extent that a new language
binding must be added? 2. Can this be fixed through simpler methods? It
wouldn’t be a good investment of time if the gain is only marginal.
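
To make the question concrete, here is a minimal pure-Java sketch of the kind of adapter Sandeep mentions below (a Java list of NDArray dimensions converted into whatever the Scala side expects). It is not MXNet code: `BackendShape` is a hypothetical stand-in for the Scala-side type and `toBackend` is a hypothetical wrapper method. It only shows where the extra copy and unboxing would happen on each call.

```java
import java.util.Arrays;
import java.util.List;

public final class JavaShapeAdapter {

    /** Hypothetical stand-in for the Scala-side shape type; not an MXNet class. */
    static final class BackendShape {
        private final int[] dims;

        BackendShape(int[] dims) {
            this.dims = dims.clone(); // defensive copy, the backend owns its array
        }

        int rank() {
            return dims.length;
        }

        /** Total number of elements described by this shape. */
        int size() {
            int n = 1;
            for (int d : dims) {
                n *= d;
            }
            return n;
        }
    }

    /** Hypothetical Java-facing entry point: unboxes and copies the list on every call. */
    static BackendShape toBackend(List<Integer> dims) {
        int[] out = new int[dims.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = dims.get(i); // Integer -> int unboxing
        }
        return new BackendShape(out);
    }

    public static void main(String[] args) {
        BackendShape shape = toBackend(Arrays.asList(2, 3, 4));
        System.out.println(shape.rank() + " dims, " + shape.size() + " elements");
    }
}
```

Whether a conversion of this size matters in practice is exactly what a benchmark would have to show.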

Best regards,
-sz

On 8/16/17, 12:46 PM, "Joern Kottmann" <kottmann@gmail.com> wrote:

    Seems like we all agree on the idea of adding a Java API.
    
    Maybe it is just me, but it wouldn't make sense for me (OpenNLP
    use case) to use the Java API if it requires a Scala dependency,
    because at that point I would be better off just using the Scala API
    and ensuring that the things I build are compatible with Java.
    
    So if I don't want to add Scala as a dependency then I am better off
    building something on top of a generated JNI layer. As far as I can
    tell from my tests with the scala-package you can get quite far with
    MXNet using NDArray and the Symbol API.
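
A thin Java layer over a generated JNI binding could start from little more than a class of native method declarations. The sketch below is only an illustration under stated assumptions, not MXNet's LibInfo: the class name, method names and signatures are all hypothetical, and no native library is loaded, so the methods are declared but never called.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public final class LibInfoSketch {

    private LibInfoSketch() {
        // static-only holder for native declarations
    }

    // Hypothetical native entry points. Names and signatures are illustrative
    // and are not copied from MXNet's LibInfo or c_api.h.
    static native int ndArrayCreate(int[] shape, long[] outHandle);

    static native int ndArrayFree(long handle);

    static native String getLastError();

    /** Counts the declared native methods, i.e. what a JNI layer would have to implement. */
    static int nativeMethodCount() {
        int count = 0;
        for (Method m : LibInfoSketch.class.getDeclaredMethods()) {
            if (Modifier.isNative(m.getModifiers())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Only reflection is used here; calling a native method without a
        // loaded library would throw UnsatisfiedLinkError.
        System.out.println("native entry points declared: " + nativeMethodCount());
    }
}
```

Because such a class contains no Scala, both a Java API and the existing Scala API could in principle depend on it.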
    
    Maybe we could work on this from two sides as described by Pracheer.
    If we have a well defined Java API you could look at the work I have
    done by then and see how it can be plugged in or what can be learnt
    from it.
    
    Jörn
    
    On Wed, Aug 16, 2017 at 9:05 PM, Nan Zhu <zhunanmcgill@gmail.com> wrote:
    > +1 for Sandeep's suggestion
    >
    > On Wed, Aug 16, 2017 at 11:21 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
    >
    >> Agree with Sandeep, although I guess the performance won't change. But
    >> yes, benchmarks talk.
    >>
    >> Moreover, in the Scala package we use macros to generate operators
    >> automatically, which will require more effort if we switch to pure
    >> Java.
    >>
    >> 2017-08-17 2:12 GMT+08:00 sandeep krishnamurthy <sandeep.krishna98@gmail.com>:
    >> > The fastest way to get a Java binding is through building Java native
    >> > wrappers on the Scala package.
    >> > Disadvantages would be:
    >> >    * *Bloated library size:* May not be suitable for users planning to use
    >> > Java APIs on Android or similar smaller systems.
    >> >    * *Performance:* Performance may not be as good as building directly
    >> > over JNI and implementing from the ground up. For example, taking NDArray dimensions
    >> > as a Java ArrayList, then converting it to a Scala Seq to adapt to the Scala
    >> > NDArray API, and more such adapters.
    >> >
    >> > However, building ground up from JNI would be a huge effort without
    >> > actually getting feedback from users early.
    >> >
    >> > *My Plan:*
    >> > 1. Build Java interface on top of Scala package.
    >> > 2. Get early feedback from users. It may turn out Java is not a great
    >> > candidate for DL training jobs.
    >> > 3. Solidify the interface (APIs) for Java users.
    >> > 4. Do performance benchmarks comparing the native Scala API with the Java interface. This
    >> > gives us comparable numbers on performance in Java.
    >> > 5. Over a period of time, replace the underlying Scala usage with a JNI base and
    >> > native Java implementation, provided feedback from users is positive.
    >> >
    >> > Comments/Suggestion?
    >> >
    >> > Regards,
    >> > Sandeep
    >> >
    >> >
    >> > On Wed, Aug 16, 2017 at 10:56 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
    >> >
    >> >> What Nan and I are worried about is the re-implementation of something
    >> >> like
    >> >> https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/Model.scala#L246,
    >> >> and the executorManager, NDArray, KVStore ... it uses.
    >> >>
    >> >> The C API stays at a very low level. If this is the purpose, we can
    >> >> simply move ml.dmlc.mxnet.LibInfo to a 'java' folder and compile without
    >> >> Scala, with no need to introduce JavaCPP. But I don't think this is what
    >> >> users want.
    >> >>
    >> >> 2017-08-17 1:41 GMT+08:00 Joern Kottmann <kottmann@gmail.com>:
    >> >> > There will be a new scala version one day, and the story we had with
    >> >> > going from 2.10 to 2.11 might just repeat. In the end if you make a
    >> >> > dependency using scala you just end up making it for the currently
    >> >> > popular scala versions. And that might be ok for projects with
    >> >> > developers who are familiar with these issues, but it is not ok for
    >> >> > java projects, where people might not expect it or know about these
    >> >> > problems. It just makes it harder to use.
    >> >> >
    >> >> > To me it looks like the C API is very stable and used by all/most
    >> >> > other APIs. If we have a Java API - accessing the C API via JavaCPP -
    >> >> > then we should end up with a pretty stable solution, and a lot of the code
    >> >> > that is duplicated with the Scala API is the generated code.
    >> >> >
    >> >> > I think we should explore this possible way of implementing it with a
    >> >> > proof-of-concept.
    >> >> >
    >> >> > And if we have a well made Java API it might be something which maybe
    >> >> > wouldn't need a lot of additions to be pleasurable to use from scala.
    >> >> >
    >> >> > Jörn
    >> >> >
    >> >> > On Wed, Aug 16, 2017 at 6:45 PM, Nan Zhu <zhunanmcgill@gmail.com> wrote:
    >> >> >> I don't think there will be problems under "11", did the user see concrete
    >> >> >> errors?
    >> >> >>
    >> >> >> Best,
    >> >> >>
    >> >> >> Nan
    >> >> >>
    >> >> >>
    >> >> >>
    >> >> >> On Wed, Aug 16, 2017 at 9:30 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
    >> >> >>
    >> >> >>> Hi Nan,
    >> >> >>>
    >> >> >>> Users have 2.11, but with a different minor version, will it cause
    >> >> >>> conflicts?
    >> >> >>>
    >> >> >>> 2017-08-17 0:19 GMT+08:00 Nan Zhu <zhunanmcgill@gmail.com>:
    >> >> >>> > Hi, Yizhi,
    >> >> >>> >
    >> >> >>> > You mean users have 2.10 env while we assemble 2.11 in it?
    >> >> >>> >
    >> >> >>> > Best,
    >> >> >>> >
    >> >> >>> > Nan
    >> >> >>> >
    >> >> >>> > On Wed, Aug 16, 2017 at 9:08 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
    >> >> >>> >
    >> >> >>> >> Hi Joern,
    >> >> >>> >>
    >> >> >>> >> The point is that the front end is not a simple wrapper of c_api.h,
    >> >> >>> >> as you mentioned, which can be easily achieved by JavaCPP.
    >> >> >>> >>
    >> >> >>> >> I have noticed the potential conflicts between the assembled Scala
    >> >> >>> >> library and the one in users' environment. Can we remove the Scala
    >> >> >>> >> library from the assembly jar? @Nan It wouldn't be a problem since the
    >> >> >>> >> Scala libraries with the same major version are compatible.
    >> >> >>> >>
    >> >> >>> >> 2017-08-16 23:49 GMT+08:00 Joern Kottmann <kottmann@gmail.com>:
    >> >> >>> >> > Hello,
    >> >> >>> >> >
    >> >> >>> >> > I personally had quite some issues with Scala dependencies in
    >> >> >>> >> > different versions and Spark, where one version is not compatible with
    >> >> >>> >> > the other version. Then you need to debug the dependency tree to find
    >> >> >>> >> > the places where the versions don't match. Every project which would
    >> >> >>> >> > like to use MXNet then has to depend on Scala and might also get
    >> >> >>> >> > conflicts if other dependencies depend on different Scala versions.
    >> >> >>> >> > Probably something which will cause issues for some of your users.
    >> >> >>> >> > Users who want to use Java might not be familiar with Scala dependency
    >> >> >>> >> > problems and have a hard time resolving them when they get strange error
    >> >> >>> >> > messages.
    >> >> >>> >> >
    >> >> >>> >> > The JNI layer could be generated with JavaCPP, then we would not need
    >> >> >>> >> > to write/maintain the C and the JVM side for that ourselves.
    >> >> >>> >> > A good example of JavaCPP and Scala usage is Apache Mahout [1].
    >> >> >>> >> >
    >> >> >>> >> > Even if we don't use JavaCPP, the JNI layer should be easy to get into
    >> >> >>> >> > a state where both can share it. The current Scala JNI layer's LibInfo
    >> >> >>> >> > classes could be converted to Java classes and would in most cases
    >> >> >>> >> > require only minor changes in the Scala code.
    >> >> >>> >> >
    >> >> >>> >> > Jörn
    >> >> >>> >> >
    >> >> >>> >> > [1] https://github.com/apache/mahout/tree/master/viennacl/src/main
    >> >> >>> >> >
    >> >> >>> >> > On Wed, Aug 16, 2017 at 5:30 PM, Nan Zhu <zhunanmcgill@gmail.com> wrote:
    >> >> >>> >> >> I agree with Yizhi
    >> >> >>> >> >>
    >> >> >>> >> >> My major concern is the duplicate implementations, which are usually one of
    >> >> >>> >> >> the major sources of bugs, especially with two languages which are
    >> >> >>> >> >> naturally interoperable (OK, calling Scala from Java might need some more
    >> >> >>> >> >> effort). It is just like providing C++ and C APIs of MXNet in two separate
    >> >> >>> >> >> packages.
    >> >> >>> >> >>
    >> >> >>> >> >> About the dependency problem, when you say "As far as I see this has the great
    >> >> >>> >> >> disadvantage that the Java API would force Scala as a dependency onto the
    >> >> >>> >> >> java users.", would you please give a concrete example causing critical
    >> >> >>> >> >> issues?
    >> >> >>> >> >>
    >> >> >>> >> >> Best,
    >> >> >>> >> >>
    >> >> >>> >> >> Nan
    >> >> >>> >> >>
    >> >> >>> >> >>
    >> >> >>> >> >>
    >> >> >>> >> >> On Wed, Aug 16, 2017 at 8:19 AM, YiZhi Liu <javelinjs@gmail.com> wrote:
    >> >> >>> >> >>
    >> >> >>> >> >>> Hi,
    >> >> >>> >> >>>
    >> >> >>> >> >>> If we build the Java API from the very beginning, i.e. the JNI part,
    >> >> >>> >> >>> we have to rewrite the code for training, predict, inferShape, etc.
    >> >> >>> >> >>> It would be too heavy to maintain a totally new front-end language.
    >> >> >>> >> >>>
    >> >> >>> >> >>> As far as I see, I don't think the Scala library dependency would be a big
    >> >> >>> >> >>> problem in most cases, unless we are going to use it in embedded
    >> >> >>> >> >>> devices. Could you illustrate some use cases where you cannot involve
    >> >> >>> >> >>> Scala dependencies?
    >> >> >>> >> >>>
    >> >> >>> >> >>> 2017-08-16 22:13 GMT+08:00 Joern Kottmann <kottmann@gmail.com>:
    >> >> >>> >> >>> > Hello,
    >> >> >>> >> >>> >
    >> >> >>> >> >>> > the approach which is taken by Spark is described here [1].
    >> >> >>> >> >>> >
    >> >> >>> >> >>> > As far as I see this has the great disadvantage that the Java API
    >> >> >>> >> >>> > would force Scala as a dependency onto the java users.
    >> >> >>> >> >>> > For a library it is always a great advantage if it doesn't have many
    >> >> >>> >> >>> > dependencies, or zero dependencies. In our case it could be quite
    >> >> >>> >> >>> > realistic to have a thin wrapper around the C API without needing any
    >> >> >>> >> >>> > other dependencies (or only dependencies which can't be avoided).
    >> >> >>> >> >>> >
    >> >> >>> >> >>> > The JNI layer could easily be shared between the Java and Scala API.
    >> >> >>> >> >>> > As far as I understand, the JNI layer in the Scala API is private
    >> >> >>> >> >>> > anyway, and a change to it wouldn't require that the public part of
    >> >> >>> >> >>> > the Scala API is changed.
    >> >> >>> >> >>> >
    >> >> >>> >> >>> > What do you think?
    >> >> >>> >> >>> >
    >> >> >>> >> >>> > Jörn
    >> >> >>> >> >>> >
    >> >> >>> >> >>> > [1] https://cwiki.apache.org/confluence/display/SPARK/Java+API+Internals
    >> >> >>> >> >>> >
    >> >> >>> >> >>> > On Wed, Aug 16, 2017 at 3:39 PM, YiZhi Liu <javelinjs@gmail.com> wrote:
    >> >> >>> >> >>> >> Hi Joern,
    >> >> >>> >> >>> >>
    >> >> >>> >> >>> >> I suggest building the Java API as a wrapper of the Scala API, re-using most of
    >> >> >>> >> >>> >> the procedures, similar to the Java API in Apache Spark.
    >> >> >>> >> >>> >>
    >> >> >>> >> >>> >> 2017-08-16 18:21 GMT+08:00 Joern Kottmann <joern@apache.org>:
    >> >> >>> >> >>> >>> Hello all,
    >> >> >>> >> >>> >>>
    >> >> >>> >> >>> >>> I would like to propose the addition of a Java API to MXNet.
    >> >> >>> >> >>> >>>
    >> >> >>> >> >>> >>> There has been some previous work done for the Scala API, and it makes
    >> >> >>> >> >>> >>> sense to at least share the JNI layer between the two.
    >> >> >>> >> >>> >>>
    >> >> >>> >> >>> >>> The Java API probably should be aligned with the Python API (and
    >> >> >>> >> >>> >>> others which exist already) with a few changes to give it a native
    >> >> >>> >> >>> >>> Java feel.
    >> >> >>> >> >>> >>>
    >> >> >>> >> >>> >>> As far as I understand there are multiple people interested in working on
    >> >> >>> >> >>> >>> this and it would be good to maybe come up with a written proposal on
    >> >> >>> >> >>> >>> how things should be.
    >> >> >>> >> >>> >>>
    >> >> >>> >> >>> >>> My motivation is to get a Java API which can be used by Apache OpenNLP
    >> >> >>> >> >>> >>> to solve various NLP tasks using Deep Learning based approaches, and I
    >> >> >>> >> >>> >>> am also interested in working on MXNet.
    >> >> >>> >> >>> >>>
    >> >> >>> >> >>> >>> Jörn
    >> >> >>> >> >>> >>
    >> >> >>> >> >>> >>
    >> >> >>> >> >>> >>
    >> >> >>> >> >>> >> --
    >> >> >>> >> >>> >> Yizhi Liu
    >> >> >>> >> >>> >> DMLC member
    >> >> >>> >> >>> >> Technical Manager
    >> >> >>> >> >>> >> Qihoo 360 Inc, Shanghai, China
    >> >> >>> >> >>>
    >> >> >>> >> >>>
    >> >> >>> >> >>>
    >> >> >>> >> >>> --
    >> >> >>> >> >>> Yizhi Liu
    >> >> >>> >> >>> DMLC member
    >> >> >>> >> >>> Technical Manager
    >> >> >>> >> >>> Qihoo 360 Inc, Shanghai, China
    >> >> >>> >> >>>
    >> >> >>> >>
    >> >> >>> >>
    >> >> >>> >>
    >> >> >>> >> --
    >> >> >>> >> Yizhi Liu
    >> >> >>> >> DMLC member
    >> >> >>> >> Technical Manager
    >> >> >>> >> Qihoo 360 Inc, Shanghai, China
    >> >> >>> >>
    >> >> >>>
    >> >> >>>
    >> >> >>>
    >> >> >>> --
    >> >> >>> Yizhi Liu
    >> >> >>> DMLC member
    >> >> >>> Technical Manager
    >> >> >>> Qihoo 360 Inc, Shanghai, China
    >> >> >>>
    >> >>
    >> >>
    >> >>
    >> >> --
    >> >> Yizhi Liu
    >> >> DMLC member
    >> >> Technical Manager
    >> >> Qihoo 360 Inc, Shanghai, China
    >> >>
    >> >
    >> >
    >> >
    >> > --
    >> > Sandeep Krishnamurthy
    >>
    >>
    >>
    >> --
    >> Yizhi Liu
    >> DMLC member
    >> Technical Manager
    >> Qihoo 360 Inc, Shanghai, China
    >>
    
