mxnet-dev mailing list archives

From Carin Meier <carinme...@gmail.com>
Subject Re: Some feedback from MXNet Zhihu topic
Date Thu, 20 Sep 2018 13:15:24 GMT
Totally agree about the potentially huge benefit of new research papers
having implementation examples in MXNet. Wondering if anyone has any
ideas about how to facilitate/encourage this?

Also wanted to note that I think the recent progress and attention to
stability will help speed up both the PR process and the release cycle.
There is more work to do in this area, especially with regard to automation
of the release, which I think will yield big dividends down the road. Let's
keep up the good work in this area.

- Carin

On Thu, Sep 20, 2018 at 4:10 AM Naveen Swamy <mnnaveen@gmail.com> wrote:

> Qing,
>
> this is a loaded list of very specific suggestions. Thank you for bringing
> it up here. Since Apache MXNet is popular in China, it would be great if
> Mandarin-speaking developers here could bring such feedback and user pain to
> the community's attention.
>
> 1. To capture the specific API/Example/Tutorial that users have an issue
> with, Mu suggested in the past adding thumbs up/down on the website:
> https://issues.apache.org/jira/browse/MXNET-972
>
> 6. The heavy code base is not because of the code in the MXNet repo, it's
> all the sub-modules that are added to the repo - I have had this problem
> too, to build MXNet I have to fetch and build the whole world that MXNet
> depends on and its dependencies (sub within sub) - I think it's time to
> revisit and refactor.
>
> For the others I suggest you work with someone to create actionable JIRAs
> (maybe Denis, because he is knowledgeable about JIRA and creates nice
> actionable stories). It would be nice if these stories could contain many
> first-good-issue tasks for new contributors to pick up - creating
> standalone examples (from existing ones) is a great one for newbies to learn
> MXNet and contribute back.
>
> Examples are very important for someone to not only learn quickly but also
> extend/adapt to their own application. In Scala we (you) have added tests
> around Examples and actually use them as integration tests - we should
> insist on the same for new examples and for old examples that we touch.
>
> In Deep Learning what is more critical and could increase rapid adoption is
> to have the latest and greatest papers implemented as examples - this is a
> call for suggestions and Action to the community.
>
> Thanks, Naveen
>
>
> On Wed, Sep 19, 2018 at 10:39 PM, Aaron Markham <aaron.s.markham@gmail.com> wrote:
>
> > Thanks for this translation and feedback Qing!
> > I've addressed point 3 of the documentation feedback with this PR:
> > https://github.com/apache/incubator-mxnet/pull/12604
> > I'm not sure how to take the first two points without some explicit URLs
> > and examples, so if anyone has those I'd be happy to take a look if there's
> > some glitch vs missing or wrong docs.
> >
> > Also, I would agree that there should be some more simple examples.
> > Oftentimes the examples are too complicated and unclear about what is
> > important or not. The audience targeted is deep learning practitioners,
> > not "newbies".
> >
> > And on a related note, I'd really like to pull the Gluon stuff into the
> > API section. It's confusing as its own navigation item and orphaned
> > information. It could have a navigation entry at the top of the API list
> > like "Python: Gluon" or just "Gluon" then list "Python: Module" or just
> > "Python". Or running this the other way, the Gluon menu could have API
> > and Tutorials and be more fleshed out, though this is not my preference.
> > Either way, it needs some attention.
> >
> > Cheers,
> > Aaron
> >
> > On Wed, Sep 19, 2018 at 11:04 AM Qing Lan <lanking520@live.com> wrote:
> >
> > > Hi all,
> > >
> > > > There was recently a trending topic <https://www.zhihu.com/question/293996867>
> > > > on Zhihu (a famous Chinese Stack Overflow + Quora) asking about the
> > > > status of MXNet in 2018. Mu replied to the thread and received more than
> > > > 300 `like`s.
> > > > However, a few concerns were raised in the comments of this thread. I
> > > > have done a simple translation from Chinese to English:
> > >
> > > > 1. Documentation! Even now, the online docs still contain:
> > > >                 1. Deprecated docs that were never updated
> > > >                 2. Wrong documentation with poor descriptions
> > > >                 3. Docs for alpha-stage features, such as ones where you
> > > > must install with `pip --pre` in order to run.
> > >
> > > > 2. Examples! For Gluon specifically, many examples still mix
> > > > Gluon/MXNet APIs. The mixture of mx.sym, mx.nd and mx.gluon confuses
> > > > users about which is the right one to choose to get their model to work.
> > > > As an example, although Gluon made data encapsulation possible, there are
> > > > still examples using mx.io.ImageRecordIter with tens of params (it feels
> > > > like the Gluon examples are simply copies of old Python examples).
> > >
> > > > 3. Examples again! Compared to PyTorch, there are a few kinds of examples
> > > > I don't like in Gluon:
> > > >                 1. Runnable, but the code structure is still very
> > > > complicated, such as example/image-classification/cifar10.py. It reads
> > > > like consecutive blocks of code concatenated together. In fact, these are
> > > > just a series of layers mixed with model.fit. This makes it very hard for
> > > > users to modify/extend the model.
> > > >                 2. Only runnable with certain settings. If users change
> > > > the model even a little, crashes happen. For example, in the multi-gpu
> > > > example on the Gluon website, MXNet hides the logic that uses the batch
> > > > size to change the learning rate in an optimizer. A lot of newbies didn't
> > > > know this and would only find that the model stopped converging when the
> > > > batch size changed.
> > > >                 3. The worst scenario is that the model itself simply
> > > > doesn't work. Maintainers in the MXNet community didn't run the model
> > > > (there is not even an integration test) and merged the code directly. The
> > > > script stays broken until somebody raises an issue and fixes it.
> > >
> > > > 4. The Community problem. The core advantage of MXNet is its scalability
> > > > and efficiency. However, the documentation of some tools is confusing.
> > > > Here are two examples:
> > > >
> > > >                 1. im2rec comes in 2 versions, C++ (binary) and Python.
> > > > But nobody would have thought that the arguments (argparse) of these
> > > > tools are different (meanwhile, there are no appropriate examples to
> > > > compare with, so users can only guess at the usage).
> > > >
> > > >                 2. How do we combine the MXNet distributed platform with
> > > > a supercomputing tool such as Slurm? How do we do profiling, and how do
> > > > we debug? A couple of companies I know thought of using MXNet for
> > > > distributed training. Due to the lack of examples and poor support from
> > > > the community, they had to move their models to TensorFlow and Horovod.
> > >
> > > > 5. The heavy code base. Most of the MXNet examples/source
> > > > code/documentation/language bindings are in a single repo. A git clone
> > > > operation costs tens of MB. A new-feature PR takes longer than expected.
> > > > The poor review response / rules keep new contributors away from the
> > > > community. I remember there was a call for documentation improvements
> > > > last year. It took one user 3 months in total to get a change merged into
> > > > master. That is almost equal to a PyTorch release interval.
> > >
> > > > 6. To Developers. Very few people in the community have discussed the
> > > > improvements we can make to make MXNet more user-friendly. It is far too
> > > > easy to trigger tens of stack issues during coding. Again, is it a
> > > > requirement for MXNet users to be familiar with C++? The connection
> > > > between Python and C lacks IDE lint support (maybe MXNet assumes every
> > > > developer is a VIM master). The API/underlying implementation changes
> > > > frequently. People have to release their code with an archived version of
> > > > MXNet (such as TuSimple and MSRA). Let's take a look at PyTorch, where
> > > > even an API used to move a tensor to a device would prompt a thorough
> > > > discussion.
> > >
> > > There will be more comments translated to English and I will keep this
> > > thread updated…
> > > Thanks,
> > > Qing
> > >
> >
>
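
On example point 3.2 in Qing's list (the batch size silently changing the
effective learning rate), here is a minimal sketch in plain Python, not MXNet
code, of why this can bite. It assumes the optimizer step divides a gradient
summed over the batch by a batch_size argument, which is how Gluon's
Trainer.step(batch_size) is documented to normalize gradients. The names
sgd_step and summed_gradient and the toy loss are hypothetical, chosen only
for illustration.

```python
# Illustration only: plain Python, no MXNet required.
# sgd_step and summed_gradient are hypothetical helper names.


def summed_gradient(weight, batch):
    """Gradient of sum_i 0.5 * (weight - x_i)**2 with respect to weight."""
    return sum(weight - x for x in batch)


def sgd_step(weight, summed_grad, lr, batch_size):
    """One SGD update where the summed gradient is rescaled by 1/batch_size."""
    return weight - lr * summed_grad / batch_size


if __name__ == "__main__":
    lr = 0.1
    small_batch = [1.0, 2.0, 3.0, 4.0]
    large_batch = small_batch * 2  # same data, twice the batch size

    # Normalizing by the true batch size gives the same update for both batches.
    w_small = sgd_step(0.0, summed_gradient(0.0, small_batch), lr, len(small_batch))
    w_large = sgd_step(0.0, summed_gradient(0.0, large_batch), lr, len(large_batch))

    # Keeping the old batch_size after the batch grew doubles the update,
    # which behaves like doubling the learning rate.
    w_stale = sgd_step(0.0, summed_gradient(0.0, large_batch), lr, len(small_batch))

    print("true batch size, 4 samples:", w_small)
    print("true batch size, 8 samples:", w_large)
    print("stale batch size, 8 samples:", w_stale)
```

Under these assumptions, a script that hard-codes the batch size in one place
and changes it in another ends up training with a different effective
learning rate, which would match the "stopped converging" symptom described
above.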
