mxnet-dev mailing list archives

From Mu Li <muli....@gmail.com>
Subject Re: Request for suggestions- Supporting onnx in mxnet
Date Thu, 19 Oct 2017 19:53:07 GMT
There is overhead in moving repos to Apache. For example, moving MXNet
took a few months. Also, we don't have enough evidence to convince the
rest of the NNVM/TVM contributors that the benefits of moving to Apache
outweigh the cost. So it is unlikely that we can "move all DMLC modules
right now".

On Thu, Oct 19, 2017 at 12:43 PM, Chris Olivier <cjolivier01@gmail.com>
wrote:

> Why don't we just move all of these dmlc modules into the Apache repository
> right now and have the correct discussions on dev?  What's the argument
> against this?  IMHO, I thought that's what was going to be done originally.
>
> On Thu, Oct 19, 2017 at 12:14 PM, Tianqi Chen <tqchen@cs.washington.edu>
> wrote:
>
> > Hi Hen:
> >
> > It is sad to think of DMLC adversarially in this matter.  DMLC projects
> > adopt the Apache way of doing things, and we are planning to move more
> > modules into Apache.
> >
> > All the discussion so far happens under the Apache manner and I do think
> > that healthy discussion on critical design issues is important. It is
> > unfair to say something is rotten just when there is a debate going on in
> > terms of technical issues.
> >
> > They are merely based on our technical assessment of what is better for
> the
> > project in general, instead of being political or chanting the detailed
> > credits or ownership of the code.
> >
> >
> > Tianqi
> >
> > On Thu, Oct 19, 2017 at 12:03 PM, Hen <bayard@apache.org> wrote:
> >
> > > What I think I'm seeing here is that:
> > >
> > > * MXNet moved to Apache.
> > > * Some of the code it relied on (50% per the last release thread, but
> > that
> > > may have been bombastic) remained at DMLC.
> > > * The MXNet community thinks one thing.
> > > * The DMLC community (which is a subset of the MXNet community that
> runs
> > > under different community rules) thinks another.
> > >
> > > Something is rotten.
> > >
> > > One solution: The MXNet community forks the DMLC code it relies on into
> > the
> > > MXNet codebase and moves on without being tied down by the decisions
> of a
> > > non-compatible community.
> > >
> > > Hen
> > >
> > >
> > >
> > > On Thu, Oct 19, 2017 at 11:59 AM, Tianqi Chen <
> tqchen@cs.washington.edu>
> > > wrote:
> > >
> > > > > Here are the detailed points (sorry for resending them again)
> > > >
> > > > Technical Reasoning:
> > > >
> > > > >  - Model exchange formats like CoreML and ONNX are not lossless and
> > > > > complete. They are designed to contain a core set of the minimum
> > > > > operators needed to support necessary inference tasks like ResNet,
> > > > > etc. So you cannot rely on a bi-directional serialization with this
> > > > > format for all MXNet models.  As a simple example, broadcast add/mul
> > > > > is simply not supported in onnx.
> > > >
> > > > - Same problem goes for compilation and in-memory IR, a core set of
> > most
> > > > interesting primitives are effectively supported.
> > > >
> > > > > - Either in the case of supporting an exchange format, or an
> > > > > in-memory IR, we need to decide which core set of operators we are
> > > > > interested in supporting.  We cannot simply say let us support
> > > > > everything from the beginning, due to the limitations of the
> > > > > exchange format.
> > > >
> > > > > - It is crucial for us to articulate what the core set of operators
> > > > > we care about in MXNet is, either to provide guidelines to the
> > > > > community or to influence the design of the model exchange formats
> > > > > themselves to move in favor of MXNet.
> > > >
> > > > > - nnvm/top is that initial core set of operators for both compiler
> > > > > support and exchange purposes. It is modeled after numpy and gluon,
> > > > > under the supervision of Eric, me and Mu.  It can be bi-directionally
> > > > > exchanged with a current mxnet operator without loss of information.
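
The point about a limited core operator set can be illustrated with a small
coverage check. This is only a sketch: the operator names and the contents of
EXCHANGE_CORE_OPS are hypothetical placeholders chosen for illustration, not
the actual ONNX or nnvm/top operator lists, and unsupported_ops is not an
MXNet or ONNX function.

```python
# Hypothetical core operator set of an exchange format (illustrative only).
EXCHANGE_CORE_OPS = frozenset({"conv2d", "relu", "dense", "softmax"})


def unsupported_ops(graph_ops, core_ops=EXCHANGE_CORE_OPS):
    """Return the operators in graph_ops that the core set cannot express.

    Each missing operator is reported once, in order of first appearance.
    """
    missing = []
    for op in graph_ops:
        if op not in core_ops and op not in missing:
            missing.append(op)
    return missing


# A toy model that uses broadcast operators outside the assumed core set.
model_ops = ["conv2d", "broadcast_add", "relu",
             "broadcast_mul", "broadcast_add", "dense"]
missing = unsupported_ops(model_ops)
print(missing)
```

A model whose list of missing operators is non-empty could not round-trip
through such a format without loss, which is the concern raised in the first
bullet above.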
> > > >
> > > > The Effort of Engineering:
> > > >
> > > > - Because nnvm/top is modeled with numpy and gluon, mxnet<-> nnvm/top
> > is
> > > > quite easy, and we already have one direction done. I would be very
> > happy
> > > > to answer any questions on another. No information loss will happen
> > with
> > > > this path.
> > > >
> > > > > - mxnet/symbol or nnvm/symbol (they are essentially the same thing
> > > > > with slightly different op defs) <- onnx is harder. There has
> > > > > already been substantial effort to support onnx 0.1, as Roshani
> > > > > mentioned, which was contributed by Zhi Zhang, another Apache MXNet
> > > > > committer. Zhi already provided code to alleviate this process.
> > > > > Building code on the existing effort would actually make the problem
> > > > > easier.
> > > >
> > > > On Thu, Oct 19, 2017 at 11:55 AM, Tianqi Chen <
> > tqchen@cs.washington.edu>
> > > > wrote:
> > > >
> > > > > > As for where the code should sit, we have seen onnx's support for
> > > > > > caffe2 sitting in a separate repo.  My suggestion would be to put
> > > > > > the code under nnvm/top and migrate it into mxnet eventually when
> > > > > > the top components get into MXNet, hopefully by the end of next
> > > > > > month.
> > > > >
> > > > > I have elaborated my point in the last email thread. This (going
> > > through
> > > > > nnvm/top) is an important design decision both technically
> > > (compilation,
> > > > > more hardware) and strategically (articulate our core set of
> > operators
> > > > and
> > > > > influence the model exchange format).
> > > > >
> > > > > I am glad to see the discussion happening and surely there is
> doubt,
> > as
> > > > > with every big step of changes.  But with the rapidly changing pace
> > of
> > > > deep
> > > > > learning systems, this is the direction that we thought is most
> > > > promising.
> > > > > We can call for a vote if necessary among the committers for the
> > design
> > > > > decision if there is still debate on this issue. Or we can keep the
> > > > > discussion open and start some effort around nnvm/top to see how it
> > > goes
> > > > >
> > > > > Tianqi
> > > > >
> > > > > On Thu, Oct 19, 2017 at 11:15 AM, Lupesko, Hagay <
> lupesko@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> Mu,
> > > > >>
> > > > >> You’re mentioning plans for a new model format and compiler, but I
> > > don’t
> > > > >> recall seeing it shared/discussed on the dev list. Can you share
> > > these,
> > > > so
> > > > >> it is more accessible to folks to understand the plan and vision?
> > > > >>
> > > > >> Personally, I think it will be a shame to add ONNX support to
> MXNet,
> > > and
> > > > >> have it implemented outside of MXNet. At the end of the day, it
> > makes
> > > > >> things difficult for MXNet users.
> > > > >>
> > > > >> Hagay
> > > > >>
> > > > >> On 10/19/17, 10:01, "Mu Li" <limu.cn@gmail.com on behalf of
> > > > >> muli.cmu@gmail.com> wrote:
> > > > >>
> > > > >>     I'm speaking under my "MXNet contributor" hat.
> > > > >>
> > > > >>     It will be sad that our new model format and compiler is not
> > > > >> supported by
> > > > >>     our own contributors. It puts us in a bad position to reach
> out
> > to
> > > > >> outside
> > > > >>     to ask for support.
> > > > >>
> > > > > >>     If you really want to do it the onnx <-> mxnet way, I suggest
> > > > > >>     putting the code under https://github.com/aws.
> > > > >>
> > > > >>     Best
> > > > >>     Mu
> > > > >>
> > > > >>     On Thu, Oct 19, 2017 at 9:51 AM, Lupesko, Hagay <
> > > lupesko@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>     > Since there seems to be a difficulty to reach a consensus
> > here,
> > > > and
> > > > >> this
> > > > >>     > is a new area, maybe a good compromise would be to
> contribute
> > > this
> > > > >> under
> > > > >>     > /contrib as experimental, with whatever way Roshani thinks
> > makes
> > > > >> sense.
> > > > >>     > Once there is code in place, and MXNet users and
> contributors
> > > are
> > > > >> able to
> > > > >>     > check it out, we can consider future steps.
> > > > >>     >
> > > > >>     > Does this proposal make sense to folks?
> > > > >>     >
> > > > >>     > On 10/18/17, 23:01, "Tianqi Chen" <workcrow@gmail.com on
> > behalf
> > > > of
> > > > >>     > tqchen@cs.washington.edu> wrote:
> > > > >>     >
> > > > >>     >     I want to offer one last thing in terms of technical
> > > details.
> > > > I
> > > > >>     > mentioned
> > > > >>     >     two trends in the deep learning systems. There is one
> last
> > > > >> thing that
> > > > >>     > is
> > > > >>     >     omitted. How should we build a good deploy end for deep
> > > > learning
> > > > >>     > models.
> > > > >>     >
> > > > >>     >     There is always a paradox to this problem:
> > > > >>     >
> > > > >>     >     - On one hand, the deployment end needs to be
> lightweight
> > > and
> > > > >> portable.
> > > > >>     >     - We want a lot of optimizations (memory layout compute)
> > and
> > > > >> feature
> > > > >>     >     support, this makes the project big.
> > > > >>     >
> > > > >>     >     All the existing systems suffer from this problem. The
> > > > solution
> > > > >> is
> > > > >>     > simple,
> > > > >>     >     separates the optimization part from the actual runtime
> > and
> > > > >> compiles
> > > > >>     > the
> > > > >>     >     things down to a bare metal module. And this is the
> > solution
> > > > >> nnvm/top
> > > > >>     >     compiler pipeline offer, which I believe will become a
> > > > standard
> > > > >>     > practice of
> > > > >>     >     deployment and where all systems go to
> > > > >>     >
> > > > >>     >     Tianqi
> > > > >>     >
> > > > >>     >     On Wed, Oct 18, 2017 at 10:03 PM, Tianqi Chen <
> > > > >>     > tqchen@cs.washington.edu>
> > > > >>     >     wrote:
> > > > >>     >
> > > > > >>     >     > OK, there is some miscommunication here, I guess. We only
> > > > > >>     >     > need to do a "canonization" step in the python API that goes
> > > > > >>     >     > through a symbol-to-symbol translation layer. It can be done
> > > > > >>     >     > purely in python, and there is no need to go "down" into c++
> > > > > >>     >     > to do this.
> > > > >>     >     >
> > > > > >>     >     > For example, the current nnvm.from_mxnet API takes a Module
> > > > > >>     >     > or Gluon module and gets you back an nnvm/top graph in python.
> > > > >>     >     >
> > > > > >>     >     > All we are asking for is to decompose it into
> > > > >>     >     >
> > > > >>     >     > def mxnet_to_onnx(module):
> > > > >>     >     >    nnvm_graph, params = nnvm_from_mxnet(module)
> > > > >>     >     >    onnx = nnvm_to_onnx(nnvm_graph, params)
> > > > >>     >     >    return onnx
> > > > >>     >     >
> > > > >>     >     > This allows nnvm_from_mxnet to be reused for other
> > > purposes,
> > > > >> like
> > > > >>     >     > compiling API to deployable modules
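
The reuse argument above can be sketched with stand-in functions. The names
nnvm_from_mxnet and nnvm_to_onnx come from the snippet above, but the bodies
here are placeholders operating on plain dictionaries, not the real nnvm or
onnx APIs; nnvm_compile, mxnet_to_deployable and the toy module layout are
assumptions added purely for illustration.

```python
# Stand-in converters: placeholder bodies, not the real nnvm/onnx APIs.
def nnvm_from_mxnet(module):
    """Pretend to lower an MXNet module to an nnvm/top graph and params."""
    graph = {"ir": "nnvm/top", "ops": list(module["ops"])}
    params = dict(module["params"])
    return graph, params


def nnvm_to_onnx(nnvm_graph, params):
    """Pretend to serialize an nnvm/top graph into an ONNX-like record."""
    return {"format": "onnx",
            "nodes": list(nnvm_graph["ops"]),
            "initializers": sorted(params)}


def nnvm_compile(nnvm_graph, params, target):
    """Hypothetical compile step producing a deployable module record."""
    return {"target": target, "kernels": list(nnvm_graph["ops"])}


def mxnet_to_onnx(module):
    # Same decomposition as in the message above.
    nnvm_graph, params = nnvm_from_mxnet(module)
    return nnvm_to_onnx(nnvm_graph, params)


def mxnet_to_deployable(module, target):
    # Reuses the first step; only the second step differs.
    nnvm_graph, params = nnvm_from_mxnet(module)
    return nnvm_compile(nnvm_graph, params, target)


toy_module = {"ops": ["conv2d", "relu", "dense"],
              "params": {"conv_weight": [0.1], "dense_weight": [0.2]}}
onnx_model = mxnet_to_onnx(toy_module)
deployable = mxnet_to_deployable(toy_module, target="opencl")
```

Both entry points share nnvm_from_mxnet, so an export path and a compile path
would only need to differ in their second step.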
> > > > >>     >     >
> > > > >>     >     > Tianqi
> > > > >>     >     >
> > > > >>     >     > On Wed, Oct 18, 2017 at 9:55 PM, Lupesko, Hagay <
> > > > >> lupesko@gmail.com>
> > > > >>     > wrote:
> > > > >>     >     >
> > > > >>     >     >> Tianqi:
> > > > >>     >     >> Thanks for detailing the trends. I fully agree that
> > ONNX
> > > is
> > > > >> just a
> > > > >>     > graph
> > > > >>     >     >> serialization format – nothing more, nothing less. I
> > also
> > > > >> think we
> > > > >>     > all
> > > > >>     >     >> agree that this simple mechanism holds lots of value
> to
> > > DL
> > > > >> users
> > > > >>     > since it
> > > > >>     >     >> allows them to move between frameworks easily (e.g.
> > train
> > > > >> with
> > > > >>     > MXNet,
> > > > >>     >     >> deploy on a mobile device with Caffe2, or the other
> way
> > > > >> around).
> > > > >>     >     >> As you said, In Memory IR is different than
> > serialization
> > > > >> formats
> > > > >>     > such as
> > > > >>     >     >> ONNX. They are designed to make the runtime execution
> > as
> > > > >> efficient
> > > > >>     > as
> > > > >>     >     >> possible, leveraging software and hardware
> > optimizations.
> > > > >> They are
> > > > >>     > indeed
> > > > >>     >     >> complex, and where the “meat” is.
> > > > >>     >     >> (BTW ONNX regards itself as an “IR” format, but not
> in
> > > the
> > > > >> same
> > > > >>     > sense as
> > > > >>     >     >> NNVM).
> > > > >>     >     >>
> > > > >>     >     >> At the end of the day, Roshani is aiming to deliver a
> > > > simple
> > > > >>     >     >> functionality to MXNet users: (1) take an ONNX file,
> > and
> > > > >> load it
> > > > >>     > into MXNet
> > > > >>     >     >> so you get a graph+weights you can work with (2)
> Given
> > a
> > > > >> trained
> > > > >>     > model,
> > > > >>     >     >> save it as an ONNX file. Since MXNet users do not
> > > interact
> > > > >> with NNVM
> > > > >>     >     >> directly, but rather interact with MXNet API (MXNet
> > > > Module),
> > > > >> isn’t
> > > > >>     > the
> > > > >>     >     >> simplest thing to do is just to construct the Module
> > “on
> > > > the
> > > > >> fly”
> > > > >>     > using
> > > > >>     >     >> MXNet API? Taking the other approach, we will go from
> > the
> > > > >> top level
> > > > >>     > MXNet
> > > > >>     >     >> “load” API, go “down” to NNVM to construct the graph,
> > go
> > > > >> back up to
> > > > >>     > MXNet
> > > > > >>     >     >> to expose it as a Module. This seems too complex and
> > does
> > > > not
> > > > >> add any
> > > > >>     >     >> benefit. In whatever way we construct the MXNet
> Module
> > > > >> object, NNVM
> > > > >>     > will
> > > > >>     >     >> always be the underlying in memory IR that is being
> > > > >> executed, so
> > > > >>     > why not
> > > > >>     >     >> take the simpler route?
> > > > >>     >     >>
> > > > >>     >     >> Hagay
> > > > >>     >     >>
> > > > >>     >     >> On 10/18/17, 19:42, "Tianqi Chen" <
> workcrow@gmail.com
> > on
> > > > >> behalf of
> > > > >>     >     >> tqchen@cs.washington.edu> wrote:
> > > > >>     >     >>
> > > > >>     >     >>     Hi Chris:
> > > > >>     >     >>
> > > > > >>     >     >>     There is no intention to move things away from mxnet. The
> > > > > >>     >     >>     reduction in lines of code comes from having a better design
> > > > > >>     >     >>     in general; usually, you write less redundant code by
> > > > > >>     >     >>     benefiting from better design. As I may quote: "the best
> > > > > >>     >     >>     design is achieved not when there is nothing to add, but
> > > > > >>     >     >>     when there is nothing to be taken away."
> > > > >>     >     >>
> > > > >>     >     >>     MXNet has always benefited from this philosophy
> and
> > > > >> improves
> > > > >>     > with the
> > > > >>     >     >> new
> > > > >>     >     >>     designs and proper modularization. For example,
> we
> > > see
> > > > >> such
> > > > >>     > reduction
> > > > >>     >     >> and
> > > > >>     >     >>     convenience happening when migrating from MXNet's
> > > > legacy
> > > > >> op to
> > > > >>     > the
> > > > >>     >     >>     NNVM's mechanism. The new mechanism now enables
> > > things
> > > > >> like
> > > > >>     > sparse
> > > > >>     >     >> aware
> > > > >>     >     >>     support and other stuff which would be much
> harder
> > to
> > > > >> support.
> > > > >>     >     >>
> > > > > >>     >     >>     The nnvm/tvm stack brings the same benefit (if not more),
> > > > > >>     >     >>     and it will only add more features to MXNet itself: offering
> > > > > >>     >     >>     more hardware backends and optimization, and allowing us to
> > > > > >>     >     >>     write less code and spend less time optimizing for each
> > > > > >>     >     >>     backend by going through TVM.
> > > > >>     >     >>
> > > > >>     >     >>     Tianqi
> > > > >>     >     >>
> > > > >>     >     >>     On Wed, Oct 18, 2017 at 7:15 PM, Chris Olivier <
> > > > >>     > cjolivier01@gmail.com
> > > > >>     >     >> >
> > > > >>     >     >>     wrote:
> > > > >>     >     >>
> > > > >>     >     >>     > Reduce code base of mxnet? By increasing scope
> of
> > > the
> > > > >> dmlc
> > > > >>     > modules?
> > > > >>     >     >> Is the
> > > > >>     >     >>     > intent to make mxnet a thin language wrapper
> > > around a
> > > > >> group
> > > > >>     > of dmlc
> > > > >>     >     >>     > modules?
> > > > >>     >     >>     >
> > > > >>     >     >>     >
> > > > >>     >     >>     > On Wed, Oct 18, 2017 at 6:58 PM Tianqi Chen <
> > > > >>     >     >> tqchen@cs.washington.edu>
> > > > >>     >     >>     > wrote:
> > > > >>     >     >>     >
> > > > >>     >     >>     > > To better answer Hagay's question, I would
> like
> > > to
> > > > >> dive
> > > > >>     > down a
> > > > >>     >     >> bit deeper
> > > > >>     >     >>     > > on the relation between MXNet, NNVM and model
> > > > >> exchange
> > > > >>     > format
> > > > >>     >     >> like ONNX.
> > > > >>     >     >>     > >
> > > > >>     >     >>     > > There are two major trends in deep learning
> > > systems
> > > > >> now:
> > > > >>     >     >>     > >
> > > > > >>     >     >>     > > - Common serializable formats, like ONNX and CoreML, that
> > > > > >>     >     >>     > > define the model exchange format.
> > > > > >>     >     >>     > > - The in-memory graph IR for quick optimization and JIT.
> > > > > >>     >     >>     > > NNVM and Tensorflow's XLA fall into this category.
> > > > >>     >     >>     > >
> > > > > >>     >     >>     > > The exchange formats are great, but they only provide a
> > > > > >>     >     >>     > > layer of conversion, which is good for exchange. The real
> > > > > >>     >     >>     > > meat still comes from the compilation and JIT pipeline you
> > > > > >>     >     >>     > > have to offer. For that, we will need an in-memory IR,
> > > > > >>     >     >>     > > because the cost of constructing and serializing could be
> > > > > >>     >     >>     > > high for exchange formats like protobuf.  And usually, the
> > > > > >>     >     >>     > > exchange formats are designed in a minimalistic fashion,
> > > > > >>     >     >>     > > making it less easy to extend them with more information to
> > > > > >>     >     >>     > > support in-depth optimization like automatic quantization
> > > > > >>     >     >>     > > and accelerator support.
> > > > >>     >     >>     > >
> > > > >>     >     >>     > > The current MXNet relies on NNVM for
> in-memory
> > IR
> > > > >>     > manipulation
> > > > >>     >     >> but does
> > > > >>     >     >>     > not
> > > > >>     >     >>     > > contain a compilation component that compiles
> > to
> > > > the
> > > > >>     > hardware
> > > > >>     >     >> backends.
> > > > > >>     >     >>     > > Exporting to an exchange format and then going back into
> > > > > >>     >     >>     > > NNVM to run the compilation imposes too much of a burden for
> > > > > >>     >     >>     > > a JIT compiler to pay. Using the same in-memory graph IR as
> > > > > >>     >     >>     > > the compilation stack gives a much greater advantage in this
> > > > > >>     >     >>     > > respect.
> > > > >>     >     >>     > >
> > > > > >>     >     >>     > > The newly introduced nnvm/top and compiler offer in-memory
> > > > > >>     >     >>     > > graph optimization and compilation, and offer more hardware
> > > > > >>     >     >>     > > backends directly via TVM. We already see promising results
> > > > > >>     >     >>     > > in edge deployments with a much lower runtime overhead. We
> > > > > >>     >     >>     > > will further benefit quickly from more graph optimizations
> > > > > >>     >     >>     > > that it has to offer.
> > > > >>     >     >>     > >
> > > > > >>     >     >>     > > Building support around this new paradigm offers us the
> > > > > >>     >     >>     > > advantage of being future compatible and takes full
> > > > > >>     >     >>     > > advantage of the points I mentioned above
> > > > >>     >     >>     > >
> > > > >>     >     >>     > > Tianqi
> > > > >>     >     >>     > >
> > > > >>     >     >>     > >
> > > > >>     >     >>     > >
> > > > >>     >     >>     > > On Wed, Oct 18, 2017 at 4:57 PM, Lupesko,
> > Hagay <
> > > > >>     >     >> lupesko@gmail.com>
> > > > >>     >     >>     > wrote:
> > > > >>     >     >>     > >
> > > > >>     >     >>     > > > Roshani – this is an exciting initiative,
> > ONNX
> > > > >> support on
> > > > >>     > MXNet
> > > > >>     >     >> will
> > > > >>     >     >>     > > > enable more users to ramp up on MXNet,
> which
> > is
> > > > >> great.
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > > Tianqi – a few questions and thoughts about
> > > your
> > > > >> note:
> > > > >>     >     >>     > > > - “More hardware backends to mxnet” – MXNet
> > > users
> > > > >> get the
> > > > >>     > same
> > > > >>     >     >> benefit
> > > > >>     >     >>     > of
> > > > >>     >     >>     > > > HW support implementing ONNX import on top
> of
> > > > MXNet
> > > > >>     > symbolic,
> > > > >>     >     >> right?
> > > > >>     >     >>     > > > - “NNVM Compiler now received contributions
> > > from
> > > > >> AWS, UW
> > > > >>     > and
> > > > >>     >     >> many other
> > > > >>     >     >>     > > > folks in MXNet community.” – agreed it is
> > > ramping
> > > > >> up, but
> > > > >>     > when
> > > > >>     >     >> you look
> > > > >>     >     >>     > > at
> > > > >>     >     >>     > > > the data, it is clear that it is very early
> > on
> > > > for
> > > > >> NNVM.
> > > > >>     >     >> Looking at the
> > > > >>     >     >>     > > > repo, it has overall 223 commits, 0
> releases.
> > > > >> Compare it
> > > > >>     > to
> > > > >>     >     >> MXNet with
> > > > >>     >     >>     > > 6136
> > > > >>     >     >>     > > > commits and 32 releases. It seems to be
> still
> > > > >> early on for
> > > > >>     >     >> NNVM, and
> > > > >>     >     >>     > for
> > > > >>     >     >>     > > a
> > > > >>     >     >>     > > > more reliable initial implementation
> building
> > > the
> > > > >> import
> > > > >>     > on top
> > > > >>     >     >> of
> > > > >>     >     >>     > MXNet
> > > > >>     >     >>     > > is
> > > > > >>     >     >>     > > > easier, faster and safer. MXNet has lots of users already
> > > > > >>     >     >>     > > > using the Symbolic API, which hopefully means that it is a
> > > > > >>     >     >>     > > > mature API that is not likely to have breaking changes or
> > > > > >>     >     >>     > > > major issues.
> > > > >>     >     >>     > > >
> > > > > >>     >     >>     > > > I’m supportive of option 1 proposed by Roshani (building
> > > > > >>     >     >>     > > > serde on top of MXNet symbolic), but doing it as an
> > > > > >>     >     >>     > > > encapsulated implementation detail, so the implementation
> > > > > >>     >     >>     > > > can be migrated to NNVM or another implementation in the
> > > > > >>     >     >>     > > > future, if at that point it seems like the right thing to
> > > > > >>     >     >>     > > > do.
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > > Interested in hearing other opinions
> though…
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > > Hagay
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > > On 10/18/17, 14:13, "Tianqi Chen" <
> > > > >> workcrow@gmail.com on
> > > > >>     >     >> behalf of
> > > > >>     >     >>     > > > tqchen@cs.washington.edu> wrote:
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     I am strongly recommending going
> through
> > > the
> > > > >>     > nnvm/top. One
> > > > >>     >     >> major
> > > > >>     >     >>     > > > reason in
> > > > >>     >     >>     > > >     here is that the support of nnvm/top
> > layer
> > > > NOT
> > > > >> ONLY
> > > > >>     > mean
> > > > >>     >     >>     > > compatibility
> > > > >>     >     >>     > > > of
> > > > >>     >     >>     > > >     model format with onnx. These are the
> > major
> > > > >> benefits:
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     - More hardware backends to mxnet,
> > > including
> > > > >> opencl,
> > > > >>     > metal,
> > > > >>     >     >>     > Raspberry
> > > > >>     >     >>     > > > Pi,
> > > > >>     >     >>     > > >     web browser. These things are
> > automatically
> > > > >> enabled
> > > > >>     > by going
> > > > >>     >     >>     > through
> > > > >>     >     >>     > > > this
> > > > > >>     >     >>     > > >     layer. In general, we design the nnvm/tvm stack to address
> > > > > >>     >     >>     > > >     mxnet's current weakness in terms of deploying to more
> > > > > >>     >     >>     > > >     hardware backends.
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     - More frontend capabilities, nnvm's
> > gluon
> > > > >> style IR
> > > > >>     > ingests
> > > > >>     >     >> now
> > > > >>     >     >>     > from
> > > > >>     >     >>     > > >     CoreML, ONNX and in future keras.
> > > Supporting
> > > > >> those
> > > > >>     > will
> > > > >>     >     >> reduce the
> > > > >>     >     >>     > > > amount
> > > > >>     >     >>     > > >     of engineering effort needed.
> > > > >>     >     >>     > > >
> > > > > >>     >     >>     > > >     - Future compatibility. We all agree that the future is
> > > > > >>     >     >>     > > >     migrating to gluon's API. NNVM/top tries to look ahead by
> > > > > >>     >     >>     > > >     directly adopting the symbolic API to be gluon.
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     I would also like to correct some of
> the
> > > > >> mentioned
> > > > >>     > facts
> > > > >>     >     >> with
> > > > >>     >     >>     > regard
> > > > >>     >     >>     > > to
> > > > >>     >     >>     > > >     nnvm/tvm stack
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     1.   Nascent project with few
> > contributors
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     NNVM Compiler now received
> contributions
> > > from
> > > > >> AWS, UW
> > > > >>     > and
> > > > >>     >     >> many
> > > > >>     >     >>     > other
> > > > >>     >     >>     > > > folks
> > > > >>     >     >>     > > >     in MXNet community. NNVM itself is
> > already
> > > > >> being used
> > > > >>     > by
> > > > >>     >     >> MXNet.
> > > > >>     >     >>     > > >     MXNet's internal IR is migrating toward
> > > > gluon,
> > > > >> and its
> > > > >>     >     >> final form
> > > > >>     >     >>     > > being
> > > > >>     >     >>     > > >     nnvm/top
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     3.   Does not support all operators
> that
> > > > exist
> > > > >> in
> > > > >>     > MXNet
> > > > >>     >     >> Symbolic
> > > > >>     >     >>     > API
> > > > >>     >     >>     > > >
> > > > > >>     >     >>     > > >     Neither NNVM/top nor onnx supports all operators that
> > > > > >>     >     >>     > > >     exist in the mxnet symbolic API. The end goal here is
> > > > > >>     >     >>     > > >     mainly to make nnvm/top onnx compatible, which is a more
> > > > > >>     >     >>     > > >     reasonable goal.
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     4.  No CI Pipeline and testcases
> > > > >>     >     >>     > > >
> > > > > >>     >     >>     > > >     NNVM already contains a compiler with unittests, and is
> > > > > >>     >     >>     > > >     CI-tested with integration (https://github.com/dmlc/nnvm),
> > > > > >>     >     >>     > > >     with a CI pipeline that is well tested on CPU and GPU
> > > > > >>     >     >>     > > >     cases for front-ends.
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     Tianqi
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     On Wed, Oct 18, 2017 at 1:41 PM,
> Roshani
> > > > >> Nagmote <
> > > > >>     >     >>     > > > roshaninagmote2@gmail.com>
> > > > >>     >     >>     > > >     wrote:
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     > Hi guys,
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > > >>     >     >>     > > >     > I am working on supporting ONNX <https://github.com/onnx/onnx>
> > > > > >>     >     >>     > > >     > pre-trained models in Apache MXNet and would like to seek
> > > > > >>     >     >>     > > >     > your opinion on the choice of implementation. I have also
> > > > > >>     >     >>     > > >     > created a GitHub issue
> > > > > >>     >     >>     > > >     > <https://github.com/apache/incubator-mxnet/issues/8319>.
> > > > > >>     >     >>     > > >     > Supporting ONNX in MXNet will enable users to move between
> > > > > >>     >     >>     > > >     > frameworks with their models; this will also enable the
> > > > > >>     >     >>     > > >     > MXNet project to be a part of the ONNX open standard and
> > > > > >>     >     >>     > > >     > steer the direction of ONNX.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > For those who don’t know ONNX, ONNX
> is
> > an
> > > > >> open
> > > > >>     > source
> > > > >>     >     >> format for
> > > > >>     >     >>     > AI
> > > > >>     >     >>     > > > models
> > > > >>     >     >>     > > >     > which enables models to be
> transferred
> > > > >> between
> > > > >>     >     >> frameworks. Refer
> > > > >>     >     >>     > to
> > > > >>     >     >>     > > >     > https://github.com/onnx/onnx for
> more
> > > > >> details.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > To implement the import/export
> > > > functionality
> > > > >> in
> > > > >>     > MXNet, I
> > > > >>     >     >> propose
> > > > >>     >     >>     > to
> > > > >>     >     >>     > > > expose
> > > > >>     >     >>     > > >     > an MXNet Python module “serde” (name
> > taken
> > > > from
> > > > >>     > Apache Hive
> > > > >>     >     >>     > project)
> > > > >>     >     >>     > > > with the
> > > > >>     >     >>     > > >     > following methods supporting
> different
> > > > >> formats:
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > sym, params = mxnet.serde.import(other_format_file, other_format='onnx')
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > other_format_file = mxnet.serde.export(mxnet_sym, mxnet_params, 'onnx')
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > The implementation under the hood can
> > be
> > > > >> done in
> > > > >>     > two ways:
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > 1) Implement at the MXNet layer by
> > > parsing
> > > > >> the ONNX
> > > > >>     >     >> model(in
> > > > >>     >     >>     > > protobuf
> > > > >>     >     >>     > > >     > format) and turn into MXNet Symbolic
> > > > >> operators and
> > > > >>     > build
> > > > >>     >     >> MXNet
> > > > >>     >     >>     > > model
> > > > >>     >     >>     > > >     > directly. Similarly, I can convert
> the
> > > > MXNet
> > > > >> model
> > > > >>     > to
> > > > >>     >     >> ONNX format
> > > > >>     >     >>     > > at
> > > > >>     >     >>     > > > this
> > > > >>     >     >>     > > >     > layer.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > 2) The DMLC community has released
> the
> > > > >> nnvm/tvm
> > > > >>     > compiler
> > > > >>     >     >> and an
> > > > >>     >     >>     > > >     > intermediate representation of the
> > > models,
> > > > >> refer:
> > > > >>     >     >>     > > >     > <http://www.tvmlang.org/2017/10/06/nnvm-compiler-announcement.html>
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > Based on the conversation on the
> GitHub
> > > > issue
> > > > >>     >     >>     > > >     > <https://github.com/apache/
> > > > >>     > incubator-mxnet/issues/8319> I
> > > > >>     >     >>     > opened,
> > > > >>     >     >>     > > Mu
> > > > >>     >     >>     > > >     > mentioned that MXNet would use
> nnvm/tvm
> > > as
> > > > >> the
> > > > >>     > backend in
> > > > >>     >     >> the
> > > > >>     >     >>     > > future.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > We could hook into this layer to
> > > implement
> > > > >> the
> > > > >>     >     >> import/export
> > > > >>     >     >>     > > > functionality.
> > > > >>     >     >>     > > >     > nnvm/tvm has ONNX 0.1 version import
> > > > >> implemented.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > For import,
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    1.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    I will need to enhance nnvm/tvm’s
> > > > >> importer to
> > > > >>     > support
> > > > >>     >     >> ONNX 0.2
> > > > >>     >     >>     > > >     >    2.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Implement nnvm/tvm->mxnet symbolic
> > > > >> operators.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > For export:
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    1.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    mxnet->nnvm/tvm ( nnvm/tvm
> provides
> > > this
> > > > >>     > implementation
> > > > >>     >     >>     > already)
> > > > >>     >     >>     > > >     >    2.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    I will need to implement
> > > > nnvm/tvm->onnx.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > These are the pros and cons I see in
> > the
> > > > >> above
> > > > >>     > approaches:
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    1.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Import/export at mxnet layer
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > Pros:
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    1.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Stable APIs currently used by
> users.
> > > > >>     >     >>     > > >     >    2.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Larger Apache MXNet community of
> > > > >> contributors.
> > > > >>     >     >>     > > >     >    3.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    CI pipeline to catch bugs.
> > > > >>     >     >>     > > >     >    4.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Comparatively less time to
> implement
> > > and
> > > > >> put it
> > > > >>     > in the
> > > > >>     >     >> hands
> > > > >>     >     >>     > of
> > > > >>     >     >>     > > > the
> > > > >>     >     >>     > > >     >    users.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > Cons:
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    1.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    In the future we may have to
> > > reimplement
> > > > >> at the
> > > > >>     >     >> nnvm/tvm
> > > > >>     >     >>     > layer,
> > > > >>     >     >>     > > > in case
> > > > >>     >     >>     > > >     >    MXNet moves to the nnvm/tvm
> > > > >> backend(assuming it
> > > > >>     > will
> > > > >>     >     >> move).
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    2.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Import/export at nnvm/tvm layer
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > Pros:
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    1.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Less engineering work in case
> mxnet
> > > > moves
> > > > >> to
> > > > >>     > nnvm/tvm
> > > > >>     >     >>     > > >     >    2.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    nnvm/tvm would become a hub to
> > convert
> > > > to
> > > > >>     > different
> > > > >>     >     >> formats.
> > > > >>     >     >>     > > >     >    3.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    nnvm operators are more in parity
> > with
> > > > >> mxnet’s
> > > > >>     > gluon
> > > > >>     >     >> APIs this
> > > > >>     >     >>     > > > could be
> > > > >>     >     >>     > > >     >    useful in case Gluon becomes the
> > only
> > > > >> standard
> > > > >>     > that
> > > > >>     >     >> MXNet will
> > > > >>     >     >>     > > > support.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > Cons:
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    1.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Nascent project with few
> > contributors
> > > > >>     >     >>     > > >     >    2.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Does not support all operators
> that
> > > > exist
> > > > >> in
> > > > >>     > MXNet
> > > > >>     >     >> Symbolic
> > > > >>     >     >>     > API
> > > > >>     >     >>     > > >     >    3.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    No CI Pipeline
> > > > >>     >     >>     > > >     >    4.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    Current Apache MXNet project does
> > not
> > > > use
> > > > >>     > nnvm/tvm
> > > > >>     >     >> backend
> > > > >>     >     >>     > > >     >    5.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >    mxnet->nnvm/tvm backend needs more
> > > > >> testing and
> > > > >>     > user
> > > > >>     >     >> feedback.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > Any suggestions on both of these
> > > > approaches?
> > > > >> From
> > > > >>     > user's
> > > > >>     >     >>     > > > perspective, this
> > > > >>     >     >>     > > >     > will be an implementation detail that
> > is
> > > > not
> > > > >>     > exposed.
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > Thanks,
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >     > Roshani
> > > > >>     >     >>     > > >     >
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > > >
> > > > >>     >     >>     > >
> > > > >>     >     >>     >
> > > > >>     >     >>
> > > > >>     >     >>
> > > > >>     >     >>
> > > > >>     >     >>
> > > > >>     >     >
> > > > >>     >
> > > > >>     >
> > > > >>     >
> > > > >>     >
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >
> > > >
> > >
> >
>
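
The `serde` entry points proposed in the quoted message can be sketched as a small format registry that dispatches on the format name. This is only an illustration under stated assumptions, not the MXNet implementation. `register_format`, `import_model`, `export_model` and the "toy-json" format are all hypothetical names. The function names also differ from the proposal because `import` is a reserved word in Python, so `mxnet.serde.import(...)` would not parse as written. Real ONNX handlers, built at either the MXNet layer or the nnvm/tvm layer, would be registered in place of the toy converter.

```python
# Hypothetical sketch of a serde-style dispatch layer. None of these names
# exist in MXNet; a toy JSON format stands in for a real ONNX converter.
import json

_IMPORTERS = {}
_EXPORTERS = {}


def register_format(name, importer, exporter):
    """Register converter callables under a format name such as 'onnx'."""
    _IMPORTERS[name] = importer
    _EXPORTERS[name] = exporter


def import_model(path, other_format):
    """Load a model file in `other_format` and return (sym, params)."""
    if other_format not in _IMPORTERS:
        raise ValueError("unsupported format: %r" % (other_format,))
    return _IMPORTERS[other_format](path)


def export_model(sym, params, other_format, path):
    """Write (sym, params) to `path` in `other_format` and return the path."""
    if other_format not in _EXPORTERS:
        raise ValueError("unsupported format: %r" % (other_format,))
    _EXPORTERS[other_format](sym, params, path)
    return path


def _toy_import(path):
    # Stand-in for "parse the foreign model and build symbol plus params".
    with open(path) as f:
        data = json.load(f)
    return data["sym"], data["params"]


def _toy_export(sym, params, path):
    # Stand-in for "walk the symbol graph and emit the foreign format".
    with open(path, "w") as f:
        json.dump({"sym": sym, "params": params}, f)


register_format("toy-json", _toy_import, _toy_export)
```

With this shape the choice between the two approaches stays hidden from users: only the callables passed to `register_format` change if the converter later moves from the MXNet layer to nnvm/tvm.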
