mxnet-dev mailing list archives

From Tianqi Chen <tqc...@cs.washington.edu>
Subject Re: Request for suggestions- Supporting onnx in mxnet
Date Thu, 19 Oct 2017 20:26:01 GMT
Again, my recommendation is to go through mxnet/gluon (which in this case
means the core operator set of NNVM), for the following technical reasons:

- Enjoy future compatibility and a compilation pipeline that other frameworks
do not have.
- Articulate Apache MXNet's need for core operators clearly, to make
Apache MXNet's position clear in influencing exchange format design.
- We agreed that the same user-facing API should be in MXNet, and going
through mxnet/gluon (nnvm) does not prevent that from happening.
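
To make the layering concrete, here is a minimal toy sketch of the idea,
assuming a two-stage export in which a shared core operator set sits between
the framework and the exchange format. Every name in it (the op tables and
the four functions) is hypothetical and chosen only for illustration; none of
it is the real MXNet, NNVM, or ONNX API, and the op mappings are simplified.

```python
# Toy sketch only: every name below is hypothetical and none of it is the
# real MXNet, NNVM, or ONNX API.

# Hypothetical mapping from framework-level op names to a small core op set.
# (Mapping "Activation" straight to "relu" is a deliberate simplification.)
FRAMEWORK_TO_CORE = {
    "Convolution": "conv2d",
    "FullyConnected": "dense",
    "Activation": "relu",
}

# Hypothetical subset of core ops that the exchange format can express.
EXCHANGE_SUPPORTED = {"conv2d": "Conv", "dense": "Gemm", "relu": "Relu"}


def framework_to_core(nodes):
    """Stage 1: canonicalize framework ops into core ops (pure Python)."""
    core = []
    for node in nodes:
        if node["op"] not in FRAMEWORK_TO_CORE:
            raise ValueError("no core operator for %r" % node["op"])
        core.append({"op": FRAMEWORK_TO_CORE[node["op"]], "name": node["name"]})
    return core


def core_to_exchange(core_nodes):
    """Stage 2: serialize core ops into an exchange-format-like dict."""
    return {
        "nodes": [
            {"op_type": EXCHANGE_SUPPORTED[n["op"]], "name": n["name"]}
            for n in core_nodes
        ]
    }


def framework_to_exchange(nodes):
    """Export is just the composition of the two stages."""
    return core_to_exchange(framework_to_core(nodes))


def compile_core(core_nodes):
    """A second consumer of stage 1: a stand-in for a compiler backend."""
    return [n["op"] for n in core_nodes]


# Usage: the same stage-1 output feeds both the exporter and the "compiler".
model = [
    {"op": "Convolution", "name": "conv0"},
    {"op": "Activation", "name": "relu0"},
]
exported = framework_to_exchange(model)
schedule = compile_core(framework_to_core(model))
```

The point of the sketch is only that the user-facing export function stays a
single call, while the first stage remains reusable by other consumers.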

Thanks for the discussions, and let us move forward with a decision. We can
call for a vote among the current committers on this issue.

Tianqi

On Thu, Oct 19, 2017 at 1:17 PM, Lupesko, Hagay <lupesko@gmail.com> wrote:

> This thread is long and forked, but I just wanted to re-iterate my
> proposal to have ONNX import/export implemented in MXNet /contrib as
> experimental.
> I think this is a good first step that hopefully allows MXNet users to
> easily leverage ONNX, but still leave a clear path to update the
> implementation later on if it makes sense.
>
> How do we move forward with a decision?
>
> On 10/19/17, 12:14, "Tianqi Chen" <workcrow@gmail.com on behalf of
> tqchen@cs.washington.edu> wrote:
>
>     Hi Hen:
>
>     It is sad to think of DMLC as adversarial in this matter.  DMLC projects
>     adopt the Apache way of doing things, and we are planning to move more
>     modules into Apache.
>
>     All the discussion so far has happened in the Apache manner, and I do
>     think that healthy discussion on critical design issues is important. It
>     is unfair to say something is rotten just because there is a debate going
>     on about technical issues.
>
>     Our positions are merely based on our technical assessment of what is
>     better for the project in general, instead of being political or arguing
>     over the detailed credits or ownership of the code.
>
>
>     Tianqi
>
>     On Thu, Oct 19, 2017 at 12:03 PM, Hen <bayard@apache.org> wrote:
>
>     > What I think I'm seeing here is that:
>     >
>     > * MXNet moved to Apache.
>     > * Some of the code it relied on (50% per the last release thread,
> but that
>     > may have been bombastic) remained at DMLC.
>     > * The MXNet community thinks one thing.
>     > * The DMLC community (which is a subset of the MXNet community that
> runs
>     > under different community rules) thinks another.
>     >
>     > Something is rotten.
>     >
>     > One solution: The MXNet community forks the DMLC code it relies on
> into the
>     > MXNet codebase and moves on without being tied down by the decisions
> of a
>     > non-compatible community.
>     >
>     > Hen
>     >
>     >
>     >
>     > On Thu, Oct 19, 2017 at 11:59 AM, Tianqi Chen <
> tqchen@cs.washington.edu>
>     > wrote:
>     >
>     > > Here are the detailed points (sorry for resending them again):
>     > >
>     > > Technical Reasoning:
>     > >
>     > > - Model exchange formats like CoreML and ONNX are not lossless or
>     > > complete. They are designed to contain a minimal core set of
>     > > operators to support necessary inference tasks like ResNet, etc.
>     > > So you cannot rely on bi-directional serialization with this format
>     > > for all MXNet models.  As a simple example, broadcast add/mul is
>     > > simply not supported in onnx.
>     > >
>     > > - The same problem applies to compilation and in-memory IR: only a
>     > > core set of the most interesting primitives is effectively supported.
>     > >
>     > > - Whether for supporting an exchange format or an in-memory IR, we
>     > > need to decide which core set of operators we are interested in
>     > > supporting.  We cannot simply say let us support everything from the
>     > > beginning, due to the limitations of the exchange format.
>     > >
>     > > - It is crucial for us to articulate the core set of operators we
>     > > care about in MXNet, either to provide guidelines to the community
>     > > or to influence the design of the model exchange formats themselves
>     > > in favor of MXNet.
>     > >
>     > > - nnvm/top is that initial core set of operators for both compiler
>     > > support and exchange purposes. It is modeled on numpy and gluon,
>     > > under the supervision of Eric, me and Mu.  It can be bi-directionally
>     > > exchanged with a current mxnet operator without loss of information.
>     > >
>     > > The Effort of Engineering:
>     > >
>     > > - Because nnvm/top is modeled on numpy and gluon, mxnet <-> nnvm/top
>     > > is quite easy, and we already have one direction done. I would be
>     > > very happy to answer any questions on the other. No information loss
>     > > will happen with this path.
>     > >
>     > > - mxnet/symbol or nnvm/symbol (they are essentially the same thing
>     > > with slightly different op defs) <- onnx is harder. There has already
>     > > been enough effort to support onnx 0.1, as Roshani mentioned, which
>     > > was contributed by Zhi Zhang, another Apache MXNet committer. Zhi has
>     > > already provided code to alleviate this process. Building on the
>     > > existing effort would actually make the problem easier.
>     > >
>     > > On Thu, Oct 19, 2017 at 11:55 AM, Tianqi Chen <
> tqchen@cs.washington.edu>
>     > > wrote:
>     > >
>     > > > As for where the code should sit, we have seen onnx's support for
>     > > > caffe2 sitting in a separate repo.  My suggestion would be to put
>     > > > the code under nnvm/top and migrate it into mxnet eventually when
>     > > > the top components get into MXNet, hopefully by the end of next
>     > > > month.
>     > > >
>     > > > I have elaborated my point in the last email thread. This (going
>     > through
>     > > > nnvm/top) is an important design decision both technically
>     > (compilation,
>     > > > more hardware) and strategically (articulate our core set of
> operators
>     > > and
>     > > > influence the model exchange format).
>     > > >
>     > > > I am glad to see the discussion happening, and surely there is
>     > > > doubt, as with every big change.  But with the rapidly changing
>     > > > pace of deep learning systems, this is the direction that we
>     > > > thought is most promising. We can call for a vote among the
>     > > > committers on the design decision if there is still debate on this
>     > > > issue. Or we can keep the discussion open and start some effort
>     > > > around nnvm/top to see how it goes.
>     > > >
>     > > > Tianqi
>     > > >
>     > > > On Thu, Oct 19, 2017 at 11:15 AM, Lupesko, Hagay <
> lupesko@gmail.com>
>     > > > wrote:
>     > > >
>     > > >> Mu,
>     > > >>
>     > > >> You’re mentioning plans for a new model format and compiler,
> but I
>     > don’t
>     > > >> recall seeing it shared/discussed on the dev list. Can you share
>     > these,
>     > > so
>     > > >> it is more accessible to folks to understand the plan and
> vision?
>     > > >>
>     > > >> Personally, I think it will be a shame to add ONNX support to
> MXNet,
>     > and
>     > > >> have it implemented outside of MXNet. At the end of the day, it
> makes
>     > > >> things difficult for MXNet users.
>     > > >>
>     > > >> Hagay
>     > > >>
>     > > >> On 10/19/17, 10:01, "Mu Li" <limu.cn@gmail.com on behalf of
>     > > >> muli.cmu@gmail.com> wrote:
>     > > >>
>     > > >>     I'm speaking under my "MXNet contributor" hat.
>     > > >>
>     > > >>     It will be sad that our new model format and compiler is not
>     > > >> supported by
>     > > >>     our own contributors. It puts us in a bad position to reach
> out to
>     > > >> outside
>     > > >>     to ask for support.
>     > > >>
>     > > >>     If you really want to do it the onnx <-> mxnet way, I suggest
>     > > >>     putting the code under https://github.com/aws.
>     > > >>
>     > > >>     Best
>     > > >>     Mu
>     > > >>
>     > > >>     On Thu, Oct 19, 2017 at 9:51 AM, Lupesko, Hagay <
>     > lupesko@gmail.com>
>     > > >> wrote:
>     > > >>
>     > > >>     > Since there seems to be a difficulty to reach a consensus
> here,
>     > > and
>     > > >> this
>     > > >>     > is a new area, maybe a good compromise would be to
> contribute
>     > this
>     > > >> under
>     > > >>     > /contrib as experimental, with whatever way Roshani
> thinks makes
>     > > >> sense.
>     > > >>     > Once there is code in place, and MXNet users and
> contributors
>     > are
>     > > >> able to
>     > > >>     > check it out, we can consider future steps.
>     > > >>     >
>     > > >>     > Does this proposal make sense to folks?
>     > > >>     >
>     > > >>     > On 10/18/17, 23:01, "Tianqi Chen" <workcrow@gmail.com on
> behalf
>     > > of
>     > > >>     > tqchen@cs.washington.edu> wrote:
>     > > >>     >
>     > > >>     >     I want to offer one last thing in terms of technical
>     > details.
>     > > I
>     > > >>     > mentioned
>     > > >>     >     two trends in the deep learning systems. There is one
> last
>     > > >> thing that
>     > > >>     > is
>     > > >>     >     omitted. How should we build a good deploy end for
> deep
>     > > learning
>     > > >>     > models.
>     > > >>     >
>     > > >>     >     There is always a paradox to this problem:
>     > > >>     >
>     > > >>     >     - On one hand, the deployment end needs to be
> lightweight
>     > and
>     > > >> portable.
>     > > >>     >     - We want a lot of optimizations (memory layout
> compute) and
>     > > >> feature
>     > > >>     >     support, this makes the project big.
>     > > >>     >
>     > > >>     >     All the existing systems suffer from this problem. The
>     > > >>     >     solution is simple: separate the optimization part from
>     > > >>     >     the actual runtime and compile things down to a
>     > > >>     >     bare-metal module. This is the solution the nnvm/top
>     > > >>     >     compiler pipeline offers, which I believe will become a
>     > > >>     >     standard practice of deployment and where all systems
>     > > >>     >     will go.
>     > > >>     >
>     > > >>     >     Tianqi
>     > > >>     >
>     > > >>     >     On Wed, Oct 18, 2017 at 10:03 PM, Tianqi Chen <
>     > > >>     > tqchen@cs.washington.edu>
>     > > >>     >     wrote:
>     > > >>     >
>     > > >>     >     > OK, there is some miscommunication here, I guess.  We
>     > > >>     >     > only need to do a "canonicalization" step in the python
>     > > >>     >     > API that goes through a symbol-to-symbol translation
>     > > >>     >     > layer. It can be done purely in python, and there is no
>     > > >>     >     > need for going "down" into c++ to do this.
>     > > >>     >     >
>     > > >>     >     > For example, the current nnvm.from_mxnet API takes a
>     > > >>     >     > Module or Gluon module and gets you back an nnvm/top
>     > > >>     >     > graph in python.
>     > > >>     >     >
>     > > >>     >     > All we are asking for is to decompose it into
>     > > >>     >     >
>     > > >>     >     > def mxnet_to_onnx(module):
>     > > >>     >     >    nnvm_graph, params = nnvm_from_mxnet(module)
>     > > >>     >     >    onnx = nnvm_to_onnx(nnvm_graph, params)
>     > > >>     >     >    return onnx
>     > > >>     >     >
>     > > >>     >     > This allows nnvm_from_mxnet to be reused for other
>     > purposes,
>     > > >> like
>     > > >>     >     > compiling API to deployable modules
>     > > >>     >     >
>     > > >>     >     > Tianqi
>     > > >>     >     >
>     > > >>     >     > On Wed, Oct 18, 2017 at 9:55 PM, Lupesko, Hagay <
>     > > >> lupesko@gmail.com>
>     > > >>     > wrote:
>     > > >>     >     >
>     > > >>     >     >> Tianqi:
>     > > >>     >     >> Thanks for detailing the trends. I fully agree
> that ONNX
>     > is
>     > > >> just a
>     > > >>     > graph
>     > > >>     >     >> serialization format – nothing more, nothing less.
> I also
>     > > >> think we
>     > > >>     > all
>     > > >>     >     >> agree that this simple mechanism holds lots of
> value to
>     > DL
>     > > >> users
>     > > >>     > since it
>     > > >>     >     >> allows them to move between frameworks easily
> (e.g. train
>     > > >> with
>     > > >>     > MXNet,
>     > > >>     >     >> deploy on a mobile device with Caffe2, or the
> other way
>     > > >> around).
>     > > >>     >     >> As you said, In Memory IR is different than
> serialization
>     > > >> formats
>     > > >>     > such as
>     > > >>     >     >> ONNX. They are designed to make the runtime
> execution as
>     > > >> efficient
>     > > >>     > as
>     > > >>     >     >> possible, leveraging software and hardware
> optimizations.
>     > > >> They are
>     > > >>     > indeed
>     > > >>     >     >> complex, and where the “meat” is.
>     > > >>     >     >> (BTW ONNX regards itself as an “IR” format, but
> not in
>     > the
>     > > >> same
>     > > >>     > sense as
>     > > >>     >     >> NNVM).
>     > > >>     >     >>
>     > > >>     >     >> At the end of the day, Roshani is aiming to
> deliver a
>     > > simple
>     > > >>     >     >> functionality to MXNet users: (1) take an ONNX
> file, and
>     > > >> load it
>     > > >>     > into MXNet
>     > > >>     >     >> so you get a graph+weights you can work with (2)
> Given a
>     > > >> trained
>     > > >>     > model,
>     > > >>     >     >> save it as an ONNX file. Since MXNet users do not
>     > interact
>     > > >> with NNVM
>     > > >>     >     >> directly, but rather interact with MXNet API (MXNet
>     > > Module),
>     > > >> isn’t
>     > > >>     > the
>     > > >>     >     >> simplest thing to do is just to construct the
> Module “on
>     > > the
>     > > >> fly”
>     > > >>     > using
>     > > >>     >     >> MXNet API? Taking the other approach, we will go
> from the
>     > > >> top level
>     > > >>     > MXNet
>     > > >>     >     >> “load” API, go “down” to NNVM to construct the
> graph, go
>     > > >> back up to
>     > > >>     > MXNet
>     > > >>     >     >> to expose it as a Module. This seems too complex and does
>     > > >>     >     >> not add any benefit. In whatever way we construct the MXNet
> Module
>     > > >> object, NNVM
>     > > >>     > will
>     > > >>     >     >> always be the underlying in memory IR that is being
>     > > >> executed, so
>     > > >>     > why not
>     > > >>     >     >> take the simpler route?
>     > > >>     >     >>
>     > > >>     >     >> Hagay
>     > > >>     >     >>
>     > > >>     >     >> On 10/18/17, 19:42, "Tianqi Chen" <
> workcrow@gmail.com on
>     > > >> behalf of
>     > > >>     >     >> tqchen@cs.washington.edu> wrote:
>     > > >>     >     >>
>     > > >>     >     >>     Hi Chris:
>     > > >>     >     >>
>     > > >>     >     >>     There is no intention to move things away from
>     > > >>     >     >>     mxnet. The reduction in lines of code comes from
>     > > >>     >     >>     having a better design in general; usually, you
>     > > >>     >     >>     write less redundant code by benefiting from better
>     > > >>     >     >>     design. As I may quote: "the best design is achieved
>     > > >>     >     >>     not when there is nothing to add, but when there is
>     > > >>     >     >>     nothing to be taken away."
>     > > >>     >     >>
>     > > >>     >     >>     MXNet has always benefited from this philosophy and
>     > > >>     >     >>     improves with new designs and proper modularization.
>     > > >>     >     >>     For example, we saw such reduction and convenience
>     > > >>     >     >>     when migrating from MXNet's legacy op to NNVM's
>     > > >>     >     >>     mechanism. The new mechanism now enables things like
>     > > >>     >     >>     sparse-aware support and other features that would
>     > > >>     >     >>     otherwise be much harder to support.
>     > > >>     >     >>
>     > > >>     >     >>     The nnvm/tvm stack brings the same benefit (if not
>     > > >>     >     >>     more), and it will only add more features to MXNet
>     > > >>     >     >>     itself: more hardware backends and optimizations,
>     > > >>     >     >>     while allowing us to write less code and spend less
>     > > >>     >     >>     time optimizing for each backend by going through TVM.
>     > > >>     >     >>
>     > > >>     >     >>     Tianqi
>     > > >>     >     >>
>     > > >>     >     >>     On Wed, Oct 18, 2017 at 7:15 PM, Chris Olivier
> <
>     > > >>     > cjolivier01@gmail.com
>     > > >>     >     >> >
>     > > >>     >     >>     wrote:
>     > > >>     >     >>
>     > > >>     >     >>     > Reduce code base of mxnet? By increasing
> scope of
>     > the
>     > > >> dmlc
>     > > >>     > modules?
>     > > >>     >     >> Is the
>     > > >>     >     >>     > intent to make mxnet a thin language wrapper
>     > around a
>     > > >> group
>     > > >>     > of dmlc
>     > > >>     >     >>     > modules?
>     > > >>     >     >>     >
>     > > >>     >     >>     >
>     > > >>     >     >>     > On Wed, Oct 18, 2017 at 6:58 PM Tianqi Chen <
>     > > >>     >     >> tqchen@cs.washington.edu>
>     > > >>     >     >>     > wrote:
>     > > >>     >     >>     >
>     > > >>     >     >>     > > To better answer Hagay's question, I would
> like
>     > to
>     > > >> dive
>     > > >>     > down a
>     > > >>     >     >> bit deeper
>     > > >>     >     >>     > > on the relation between MXNet, NNVM and
> model
>     > > >> exchange
>     > > >>     > format
>     > > >>     >     >> like ONNX.
>     > > >>     >     >>     > >
>     > > >>     >     >>     > > There are two major trends in deep learning
>     > systems
>     > > >> now:
>     > > >>     >     >>     > >
>     > > >>     >     >>     > > - Common serializable formats, like ONNX
> and
>     > > CoreML,
>     > > >> that
>     > > >>     > defines
>     > > >>     >     >> the
>     > > >>     >     >>     > model
>     > > >>     >     >>     > > exchange format.
>     > > >>     >     >>     > > - The in-memory graph IR for quick
> optimization
>     > and
>     > > >> JIT.
>     > > >>     > NNVM,
>     > > >>     >     >>     > Tensorflow's
>     > > >>     >     >>     > > XLA falls into this category.
>     > > >>     >     >>     > >
>     > > >>     >     >>     > > The exchange formats are great, but they only
>     > > >>     >     >>     > > add a layer of conversion, which is good for
>     > > >>     >     >>     > > exchange. The real meat still comes from the
>     > > >>     >     >>     > > compilation and JIT pipeline you have to offer.
>     > > >>     >     >>     > > For that, we need an in-memory IR, because the
>     > > >>     >     >>     > > cost of constructing and serializing exchange
>     > > >>     >     >>     > > formats like protobuf can be high.  And usually,
>     > > >>     >     >>     > > the exchange formats are designed in a
>     > > >>     >     >>     > > minimalistic fashion, making it harder to attach
>     > > >>     >     >>     > > the extra information needed for in-depth
>     > > >>     >     >>     > > optimizations like automatic quantization and
>     > > >>     >     >>     > > accelerator support.
>     > > >>     >     >>     > >
>     > > >>     >     >>     > > The current MXNet relies on NNVM for in-memory IR
>     > > >>     >     >>     > > manipulation but does not contain a compilation
>     > > >>     >     >>     > > component that compiles to the hardware backends.
>     > > >>     >     >>     > > Exporting to an exchange format and then going
>     > > >>     >     >>     > > back into NNVM to run the compilation imposes
>     > > >>     >     >>     > > more overhead than a JIT compiler can afford.
>     > > >>     >     >>     > > Using the same in-memory graph IR as the
>     > > >>     >     >>     > > compilation stack is much more advantageous in
>     > > >>     >     >>     > > this respect.
>     > > >>     >     >>     > >
>     > > >>     >     >>     > > The newly introduced nnvm/top and compiler offer
>     > > >>     >     >>     > > in-memory graph optimization and compilation, and
>     > > >>     >     >>     > > offer more hardware backends directly via TVM. We
>     > > >>     >     >>     > > already see promising results in edge deployments
>     > > >>     >     >>     > > with a much lower runtime overhead. We will
>     > > >>     >     >>     > > further benefit quickly from more graph
>     > > >>     >     >>     > > optimizations that it has to offer.
>     > > >>     >     >>     > >
>     > > >>     >     >>     > > Building support around this new paradigm offers
>     > > >>     >     >>     > > us the advantage of being future compatible and
>     > > >>     >     >>     > > takes full advantage of the points I mentioned
>     > > >>     >     >>     > > above.
>     > > >>     >     >>     > >
>     > > >>     >     >>     > > Tianqi
>     > > >>     >     >>     > >
>     > > >>     >     >>     > >
>     > > >>     >     >>     > >
>     > > >>     >     >>     > > On Wed, Oct 18, 2017 at 4:57 PM, Lupesko,
> Hagay <
>     > > >>     >     >> lupesko@gmail.com>
>     > > >>     >     >>     > wrote:
>     > > >>     >     >>     > >
>     > > >>     >     >>     > > > Roshani – this is an exciting
> initiative, ONNX
>     > > >> support on
>     > > >>     > MXNet
>     > > >>     >     >> will
>     > > >>     >     >>     > > > enable more users to ramp up on MXNet,
> which is
>     > > >> great.
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > > Tianqi – a few questions and thoughts
> about
>     > your
>     > > >> note:
>     > > >>     >     >>     > > > - “More hardware backends to mxnet” –
> MXNet
>     > users
>     > > >> get the
>     > > >>     > same
>     > > >>     >     >> benefit
>     > > >>     >     >>     > of
>     > > >>     >     >>     > > > HW support implementing ONNX import on
> top of
>     > > MXNet
>     > > >>     > symbolic,
>     > > >>     >     >> right?
>     > > >>     >     >>     > > > - “NNVM Compiler now received
> contributions
>     > from
>     > > >> AWS, UW
>     > > >>     > and
>     > > >>     >     >> many other
>     > > >>     >     >>     > > > folks in MXNet community.” – agreed it is
>     > ramping
>     > > >> up, but
>     > > >>     > when
>     > > >>     >     >> you look
>     > > >>     >     >>     > > at
>     > > >>     >     >>     > > > the data, it is clear that it is very
> early on
>     > > for
>     > > >> NNVM.
>     > > >>     >     >> Looking at the
>     > > >>     >     >>     > > > repo, it has overall 223 commits, 0
> releases.
>     > > >> Compare it
>     > > >>     > to
>     > > >>     >     >> MXNet with
>     > > >>     >     >>     > > 6136
>     > > >>     >     >>     > > > commits and 32 releases. It seems to be
> still
>     > > >> early on for
>     > > >>     >     >> NNVM, and
>     > > >>     >     >>     > for
>     > > >>     >     >>     > > a
>     > > >>     >     >>     > > > more reliable initial implementation
> building
>     > the
>     > > >> import
>     > > >>     > on top
>     > > >>     >     >> of
>     > > >>     >     >>     > MXNet
>     > > >>     >     >>     > > is
>     > > >>     >     >>     > > > easier, faster and safer. MXNet has lots
> of
>     > users
>     > > >> already
>     > > >>     > using
>     > > >>     >     >> the
>     > > >>     >     >>     > > > Symbolic API which hopefully mean that
> is a
>     > > mature
>     > > >> API
>     > > >>     > that is
>     > > >>     >     >> not
>     > > >>     >     >>     > likely
>     > > >>     >     >>     > > > to have breaking changes or major issues.
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > > I’m supportive of option 1 proposed by Roshani
>     > > >>     >     >>     > > > (building serde on top of MXNet symbolic), but
>     > > >>     >     >>     > > > doing it as an encapsulated implementation
>     > > >>     >     >>     > > > detail, so the implementation can be migrated to
>     > > >>     >     >>     > > > NNVM or another implementation in the future, if
>     > > >>     >     >>     > > > at that point it seems like the right thing to
>     > > >>     >     >>     > > > do.
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > > Interested in hearing other opinions
> though…
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > > Hagay
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > > On 10/18/17, 14:13, "Tianqi Chen" <
>     > > >> workcrow@gmail.com on
>     > > >>     >     >> behalf of
>     > > >>     >     >>     > > > tqchen@cs.washington.edu> wrote:
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     I strongly recommend going through nnvm/top. One
>     > > >>     >     >>     > > >     major reason is that support for the nnvm/top
>     > > >>     >     >>     > > >     layer does NOT ONLY mean compatibility of the
>     > > >>     >     >>     > > >     model format with onnx. These are the major
>     > > >>     >     >>     > > >     benefits:
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     - More hardware backends for mxnet, including
>     > > >>     >     >>     > > >     opencl, metal, Raspberry Pi, and web browser.
>     > > >>     >     >>     > > >     These are automatically enabled by going through
>     > > >>     >     >>     > > >     this layer. In general, we designed the nnvm/tvm
>     > > >>     >     >>     > > >     stack to address current mxnet's weakness in
>     > > >>     >     >>     > > >     deploying to more hardware backends.
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     - More frontend capabilities. nnvm's gluon-style
>     > > >>     >     >>     > > >     IR now ingests from CoreML and ONNX, and in the
>     > > >>     >     >>     > > >     future keras. Supporting those will reduce the
>     > > >>     >     >>     > > >     amount of engineering effort needed.
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     - Future compatibility. We all agree that the
>     > > >>     >     >>     > > >     future is migrating to gluon's API. NNVM/top
>     > > >>     >     >>     > > >     tries to look ahead by directly adopting the
>     > > >>     >     >>     > > >     symbolic API to be gluon.
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     I would also like to correct some of
> the
>     > > >> mentioned
>     > > >>     > facts
>     > > >>     >     >> with
>     > > >>     >     >>     > regard
>     > > >>     >     >>     > > to
>     > > >>     >     >>     > > >     nnvm/tvm stack
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     1.   Nascent project with few
> contributors
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     NNVM Compiler has now received contributions
>     > > >>     >     >>     > > >     from AWS, UW and many other folks in the MXNet
>     > > >>     >     >>     > > >     community. NNVM itself is already being used by
>     > > >>     >     >>     > > >     MXNet. MXNet's internal IR is migrating toward
>     > > >>     >     >>     > > >     gluon, with its final form being nnvm/top.
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     3.   Does not support all operators
> that
>     > > exist
>     > > >> in
>     > > >>     > MXNet
>     > > >>     >     >> Symbolic
>     > > >>     >     >>     > API
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     Neither NNVM/top nor onnx supports all operators
>     > > >>     >     >>     > > >     that exist in the mxnet symbolic API. The end
>     > > >>     >     >>     > > >     goal here is mainly to make nnvm/top onnx
>     > > >>     >     >>     > > >     compatible, which is a more reasonable goal.
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     4.  No CI Pipeline and testcases
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     NNVM already contains a compiler with unit tests
>     > > >>     >     >>     > > >     and CI-tested integration
>     > > >>     >     >>     > > >     (https://github.com/dmlc/nnvm), with a CI
>     > > >>     >     >>     > > >     pipeline that is well tested on CPU and GPU cases
>     > > >>     >     >>     > > >     for front-ends.
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     Tianqi
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     On Wed, Oct 18, 2017 at 1:41 PM,
> Roshani
>     > > >> Nagmote <
>     > > >>     >     >>     > > > roshaninagmote2@gmail.com>
>     > > >>     >     >>     > > >     wrote:
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >     > Hi guys,
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > I am working on supporting ONNX
>     > > >>     >     >>     > > >     > <https://github.com/onnx/onnx>
>     > > >>     >     >>     > > > pre-trained
>     > > >>     >     >>     > > >     > models in Apache MXNet and would
> like to
>     > > >> seek your
>     > > >>     >     >> opinion on the
>     > > >>     >     >>     > > > choice of
>     > > >>     >     >>     > > >     > implementation. I also have
> created a
>     > > GitHub
>     > > >> issue
>     > > >>     >     >>     > > >     > <https://github.com/apache/incubator-mxnet/issues/8319>.
>     > > >>     >     >>     > > Supporting
>     > > >>     >     >>     > > > ONNX
>     > > >>     >     >>     > > >     > in
>     > > >>     >     >>     > > >     > MXNet will enable users to move
> between
>     > > >> frameworks
>     > > >>     > with
>     > > >>     >     >> their
>     > > >>     >     >>     > > > models, this
>     > > >>     >     >>     > > >     > will also enable MXNet project to
> be a
>     > part
>     > > >> of the
>     > > >>     > ONNX
>     > > >>     >     >> open
>     > > >>     >     >>     > > > standard and
>     > > >>     >     >>     > > >     > steer the direction of ONNX.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > For those who don’t know ONNX,
> ONNX is an
>     > > >> open
>     > > >>     > source
>     > > >>     >     >> format for
>     > > >>     >     >>     > AI
>     > > >>     >     >>     > > > models
>     > > >>     >     >>     > > >     > which enables models to be
> transferred
>     > > >> between
>     > > >>     >     >> frameworks. Refer
>     > > >>     >     >>     > to
>     > > >>     >     >>     > > >     > https://github.com/onnx/onnx for
> more
>     > > >> details.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > To implement the import/export
>     > > functionality
>     > > >> in
>     > > >>     > MXNet, I
>     > > >>     >     >> propose
>     > > >>     >     >>     > to
>     > > >>     >     >>     > > > expose
>     > > >>     >     >>     > > >     > an MXNet Python module “serde” (name
> taken
>     > > from
>     > > >>     > Apache Hive
>     > > >>     >     >>     > project)
>     > > >>     >     >>     > > > with the
>     > > >>     >     >>     > > >     > following methods supporting
> different
>     > > >> formats:
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > sym, params = mxnet.serde.import(other_format_file, other_format='onnx')
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > other_format_file = mxnet.serde.export(mxnet_sym, mxnet_params, 'onnx')
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > The implementation under the hood
> can be
>     > > >> done in
>     > > >>     > two ways:
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > 1) Implement at the MXNet layer by
>     > parsing
>     > > >> the ONNX
>     > > >>     >     >> model (in
>     > > >>     >     >>     > > protobuf
>     > > >>     >     >>     > > >     > format) and turn into MXNet
> Symbolic
>     > > >> operators and
>     > > >>     > build
>     > > >>     >     >> MXNet
>     > > >>     >     >>     > > model
>     > > >>     >     >>     > > >     > directly. Similarly, I can convert
> the
>     > > MXNet
>     > > >> model
>     > > >>     > to
>     > > >>     >     >> ONNX format
>     > > >>     >     >>     > > at
>     > > >>     >     >>     > > > this
>     > > >>     >     >>     > > >     > layer.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > 2) The DMLC community has released
> the
>     > > >> nnvm/tvm
>     > > >>     > compiler
>     > > >>     >     >> and an
>     > > >>     >     >>     > > >     > intermediate representation of the
>     > models,
>     > > >> refer:
>     > > >>     >     >>     > > >     > http://www.tvmlang.org/2017/10/06/nnvm/tvm-compiler-announcement.html
>     > > >>     >     >>     > > >     > <http://www.tvmlang.org/2017/10/06/nnvm-compiler-announcement.html>
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > Based on the conversation on the
> GitHub
>     > > issue
>     > > >>     >     >>     > > >     > <https://github.com/apache/incubator-mxnet/issues/8319> I
>     > > >>     >     >>     > opened,
>     > > >>     >     >>     > > Mu
>     > > >>     >     >>     > > >     > mentioned that MXNet would use
> nnvm/tvm
>     > as
>     > > >> the
>     > > >>     > backend in
>     > > >>     >     >> the
>     > > >>     >     >>     > > future.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > We could hook into this layer to
>     > implement
>     > > >> the
>     > > >>     >     >> import/export
>     > > >>     >     >>     > > > functionality.
>     > > >>     >     >>     > > >     > nnvm/tvm has ONNX 0.1 version
> import
>     > > >> implemented.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > For import,
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >    1. I will need to enhance nnvm/tvm’s importer to support ONNX 0.2
>     > > >>     >     >>     > > >     >    2. Implement nnvm/tvm->mxnet symbolic operators.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > For export:
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >    1. mxnet->nnvm/tvm (nnvm/tvm already provides this implementation)
>     > > >>     >     >>     > > >     >    2. I will need to implement nnvm/tvm->onnx.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > These are the pros and cons I see
> in the
>     > > >> above
>     > > >>     > approaches:
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >    1. Import/export at mxnet layer
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > Pros:
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >    1. Stable APIs currently used by users.
>     > > >>     >     >>     > > >     >    2. Larger Apache MXNet community of contributors.
>     > > >>     >     >>     > > >     >    3. CI pipeline to catch bugs.
>     > > >>     >     >>     > > >     >    4. Comparatively less time to implement and put it in the hands of the users.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > Cons:
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >    1. In the future we may have to reimplement at the nnvm/tvm layer, in case
>     > > >>     >     >>     > > >     >       MXNet moves to the nnvm/tvm backend (assuming it will move).
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >    2. Import/export at nnvm/tvm layer
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > Pros:
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >    1. Less engineering work in case mxnet moves to nnvm/tvm
>     > > >>     >     >>     > > >     >    2. nnvm/tvm would become a hub to convert to different formats.
>     > > >>     >     >>     > > >     >    3. nnvm operators are more in parity with mxnet’s gluon APIs; this could be
>     > > >>     >     >>     > > >     >       useful in case Gluon becomes the only standard that MXNet will support.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > Cons:
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >    1. Nascent project with few contributors
>     > > >>     >     >>     > > >     >    2. Does not support all operators that exist in MXNet Symbolic API
>     > > >>     >     >>     > > >     >    3. No CI Pipeline
>     > > >>     >     >>     > > >     >    4. Current Apache MXNet project does not use nnvm/tvm backend
>     > > >>     >     >>     > > >     >    5. mxnet->nnvm/tvm backend needs more testing and user feedback.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > Any suggestions on both of these
>     > > approaches?
>     > > >> From
>     > > >>     > user's
>     > > >>     >     >>     > > > perspective, this
>     > > >>     >     >>     > > >     > will be an implementation detail
> that is
>     > > not
>     > > >>     > exposed.
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > Thanks,
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >     > Roshani
>     > > >>     >     >>     > > >     >
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > > >
>     > > >>     >     >>     > >
>     > > >>     >     >>     >
>     > > >>     >     >>
>     > > >>     >     >>
>     > > >>     >     >>
>     > > >>     >     >>
>     > > >>     >     >
>     > > >>     >
>     > > >>     >
>     > > >>     >
>     > > >>     >
>     > > >>
>     > > >>
>     > > >>
>     > > >>
>     > > >
>     > >
>     >
>
>
>
>
>
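The "serde" entry points proposed in the quoted message can be sketched roughly as a format registry. This is only an illustration, not MXNet's API: `register_format`, `import_model` and `export_model` are hypothetical names, and the backends are stand-in callables. One detail the sketch has to work around is that `import` is a reserved word in Python, so a function literally named `mxnet.serde.import` could not be called as written; the sketch uses `import_model` instead.

```python
# Hypothetical sketch of a format-dispatching "serde" module.
# None of these names exist in MXNet; they only illustrate the proposal.
from typing import Any, Callable, Dict, Tuple

Importer = Callable[[str], Tuple[Any, Any]]  # path -> (sym, params)
Exporter = Callable[[Any, Any], str]         # (sym, params) -> output file

# Registry mapping a lower-cased format name to its converter pair.
_BACKENDS: Dict[str, Tuple[Importer, Exporter]] = {}


def register_format(name: str, importer: Importer, exporter: Exporter) -> None:
    """Register the converters for one exchange format."""
    _BACKENDS[name.lower()] = (importer, exporter)


def _lookup(name: str) -> Tuple[Importer, Exporter]:
    try:
        return _BACKENDS[name.lower()]
    except KeyError:
        raise ValueError("unsupported format: %r" % name) from None


def import_model(path: str, other_format: str = "onnx"):
    """Return (sym, params) built by the importer registered for the format."""
    importer, _ = _lookup(other_format)
    return importer(path)


def export_model(sym: Any, params: Any, other_format: str = "onnx") -> str:
    """Return the output file produced by the exporter for the format."""
    _, exporter = _lookup(other_format)
    return exporter(sym, params)
```

With this shape, whether a converter is written directly against MXNet symbols or goes through nnvm stays hidden behind the registered callables, which matches the point that the choice is an implementation detail from the user's perspective.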
