mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Qing Lan <lanking...@live.com>
Subject Re: The operator check for Scala Package
Date Thu, 21 Jun 2018 05:36:00 GMT
Thanks Jun for your clarification. I think one of the advantages for MXNet is it's language
binding. Although the number of customers using Scala is great less than Python, it is still
one important language we need to support. 

About the first question, I would say we all in. At least all breaking changes should let
all language binding maintainers notified. Then we make the changes to make sure we have stable
support to the customers. The reason for Scala build failure could be many, so we need to
diagnose the problem and see if we need to collaborate and solve the problems together. I
don't think the other language maintainers and backend must take the responsibilities to diagnose
a problems of a language they may not familiar with.

2. It depends differently case by case, we cannot bring a unique time for different failures
we have. If you mean the operator check, it could cause you (time to build scalapkg) + (30
seconds to change the code) + (time to run the CI to make sure it worked) + (send a PR to
make that change on master branch). This update should be done by the Scala developers for
sure.

3. If there is a nightly build failure, maybe we can think of checking that on a weekly basis.
Spend 15 mins every week to review the operator changes so we don't miss any important points.

Thanks,
Qing



´╗┐On 6/20/18, 9:57 PM, "Jun Wu" <wujun.nju@gmail.com> wrote:

    We should reach an agreement on the responsibility/ownership before moving
    forward.
    
    1. Who will take the ownership of fixing Scala build failure if an operator
    is changed/added in C++?
    2. What is the expected turn around time of fixing the scala build failure.
    3. What if we are short of resources of fixing the scala build failure? How
    do we deal with the failing nightly CI every night?
    
    With all being said, there is at least one thing that must be clear, we
    certainly don't want to put the extra burden on the backend developers.
    
    On Wed, Jun 20, 2018 at 9:39 PM Qing Lan <lanking520@live.com> wrote:
    
    > Hi All,
    >
    > Thank you all for your feedback. This changes do influence a lot on the
    > PRs related to operator changes. I will take what Marco proposed to place
    > that in the nightly build rather than every CI we runs.
    >
    > Thanks,
    > Qing
    >
    > On 6/20/18, 8:40 PM, "Hao Jin" <hjjn.amzn@gmail.com> wrote:
    >
    >     I don't think we should keep this hash check thing. As Qing stated
    > before
    >     on this thread, if there's any change in documentation the hash value
    > is
    >     also changed and then flagged as a problem, how will that break any
    > user's
    >     code? I would go for something like Marco's proposal: moving this to an
    >     asynchronous check.
    >     At this very moment, I've got 4 PRs that are all related to "operator
    >     changes", adopting this kind of method is adding extra work for me and
    >     every other contributor whose work involves changes on operator code,
    > and I
    >     don't think it's a reasonable kind of extra work like tracking PRs on
    > JIRA.
    >     Hao
    >
    >     On Wed, Jun 20, 2018 at 8:10 PM, Naveen Swamy <mnnaveen@gmail.com>
    > wrote:
    >
    >     > Agreed, for new APIs we can turn into a nightly job and report on
    > dev@.
    >     > The
    >     > goal here is not to burden anyone but to catch changes on the backend
    >     > before it propagates and break user's code and co-ordinate changes
    > across
    >     > language bindings which IMO is essential, so I would like to keep for
    >     > changes on the existing API.
    >     >
    >     > On Wed, Jun 20, 2018 at 7:58 PM, Marco de Abreu <
    >     > marco.g.abreu@googlemail.com.invalid> wrote:
    >     >
    >     > > I think we should go with an asychronous approach using a nightly
    > job
    >     > that
    >     > > detects the changes and reports them accordingly. We could then
    > send out
    >     > a
    >     > > report if there is a mismatch.
    >     > >
    >     > > I agree with the already stated points that we should not put the
    > burden
    >     > of
    >     > > adding frontend API bindings to somebody who adds a Backend
    > function.
    >     > This
    >     > > is not scalable and rather poses a risk in reduced quality or
    > increased
    >     > > review overhead.
    >     > >
    >     > > The sparse support is the perfect example. I don't know if haibin
    > knows
    >     > > Scala, but he would have had to make the entire mapping for Scala
    > without
    >     > > being very familiar with it (sorry if I'm wrong on this haibin,
    > it's just
    >     > > an example). Imagine we do this for even more APIs - we would
    > basically
    >     > > block ourselves because everyone who makes a change in the Backend
    > has to
    >     > > know a dozen languages.
    >     > >
    >     > > While in a few cases it might be just a few lines, I'd guess that
    >     > > especially for new operators it would require a proper design,
    > coding and
    >     > > review to map an API to the frontend languages. This should rather
    > be
    >     > done
    >     > > by an expert in my opinion.
    >     > >
    >     > > We could define it as a requirement for a release that all APIs
    > have had
    >     > to
    >     > > be transfered before a release can be cut (We already have it as
    >     > > requirement that all nightly jobs have to pass anyways).
    >     > >
    >     > > -Marco
    >     > >
    >     > > Naveen Swamy <mnnaveen@gmail.com> schrieb am Mi., 20. Juni 2018,
    > 19:50:
    >     > >
    >     > > > I understand the concerns here. However the problem here is that
    > since
    >     > > the
    >     > > > Scala APIs are generated using Macros and we do not(may not
    > ever) have
    >     > > full
    >     > > > test coverage on each of the APIs, we will not know for example
    > if an
    >     > > > operator/API changes on the backend and that propagates to Scala
    > users,
    >     > > > their code might very well fail.  So what we are trying here is
    > to find
    >     > > out
    >     > > > early if there is an operator/API change and avoid breaking
    > user's
    >     > code.
    >     > > >
    >     > > > I think there is value in co-ordinating addition of APIs across
    >     > bindings,
    >     > > > as an example the sparse APIs introduced was(still not) exposed
    > to
    >     > Scala
    >     > > > users. This begs the question of how do we co-ordinate such
    > changes?
    >     > > Should
    >     > > > we turn addition of new APIs also as warnings ?
    >     > > >
    >     > > > I agree that we shouldn't fail on documentation change, may we
    > should
    >     > > turn
    >     > > > that into warnings and make sure to pick it up on the Scala
    > APIs, this
    >     > is
    >     > > > low risk since it documentation does not change.
    >     > > >
    >     > > > Open for suggestions.
    >     > > >
    >     > > >
    >     > > > On Wed, Jun 20, 2018 at 7:32 PM, YiZhi Liu <eazhi.liu@gmail.com>
    >     > wrote:
    >     > > >
    >     > > > > Hi Qing,
    >     > > > > What is the exact risk of changing / adding operators? could
    > you
    >     > > > > provide an example? I also feel the way you proposed is little
    > bit
    >     > too
    >     > > > > heavy to developers, and not quite friendly to new
    > contributors.
    >     > > > > On Wed, Jun 20, 2018 at 10:22 AM Haibin Lin <
    >     > haibin.lin.aws@gmail.com>
    >     > > > > wrote:
    >     > > > > >
    >     > > > > > I appreciate the effort and understand the motivation.
    > However, I'm
    >     > > > > > concerned that it basically means merging operator PRs
    > becomes
    >     > > > > sequential.
    >     > > > > > Developers who work on operators have to update their
PR
    > every
    >     > time a
    >     > > > new
    >     > > > > > operator is merged to master, the burden becomes significant
    > if
    >     > > > there're
    >     > > > > 20
    >     > > > > > ONNX/sparse operators to add and many PRs are
    > submitted/reviewed in
    >     > > > > > parallel.
    >     > > > > >
    >     > > > > > On Wed, Jun 20, 2018 at 10:13 AM, Qing Lan <
    > lanking520@live.com>
    >     > > > wrote:
    >     > > > > >
    >     > > > > > > Hi Haibin,
    >     > > > > > >
    >     > > > > > > The operator change is any changes on the operator
on C++
    > side.
    >     > > > Trigger
    >     > > > > > > the check failed?
    >     > > > > > >    - change the documentation of operator in C
    >     > > > >   Yes
    >     > > > > > >    - change the documentation such as README.md
    >     >  No
    >     > > > > > >    - add/remove/modify operator
    >     >  Yes
    >     > > > > > >    - add/remove/modify operator parameter
    >     > > > >  Yes
    >     > > > > > >
    >     > > > > > > Thanks,
    >     > > > > > > Qing
    >     > > > > > >
    >     > > > > > > On 6/20/18, 10:01 AM, "Haibin Lin" <
    > haibin.lin.aws@gmail.com>
    >     > > > wrote:
    >     > > > > > >
    >     > > > > > >     Could you elaborate what you mean by operator
change?
    > Does it
    >     > > > > check the
    >     > > > > > >     operator interface? Would updated operator
    > documentation fail
    >     > > the
    >     > > > > > > check?
    >     > > > > > >     Would adding a new operator fail this check?
    >     > > > > > >
    >     > > > > > >
    >     > > > > > >
    >     > > > > > >     On Wed, Jun 20, 2018 at 9:48 AM, Qing Lan <
    >     > lanking520@live.com
    >     > > >
    >     > > > > wrote:
    >     > > > > > >
    >     > > > > > >     > Hi Macro,
    >     > > > > > >     >
    >     > > > > > >     > Thanks for your feedback! I believe this
should not
    > be a
    >     > > block
    >     > > > > for
    >     > > > > > >     > contributor in most of the cases.
    >     > > > > > >     > Firstly, this would only be triggered
if there is an
    >     > operator
    >     > > > > changes
    >     > > > > > >     > (Only general operators).
    >     > > > > > >     > Secondly, it is simple to go through.
Just need to
    > change 1
    >     > > > line
    >     > > > > of
    >     > > > > > > code
    >     > > > > > >     > would make the PR passed. However it do
requires
    >     > contributor
    >     > > to
    >     > > > > do
    >     > > > > > > this or
    >     > > > > > >     > the Scalatest will fail. I have made the
error
    > message
    >     > > > > instructive
    >     > > > > > > that
    >     > > > > > >     > would help the contributor to dive in
and make the
    > changes.
    >     > > > > > >     >
    >     > > > > > >     > I also updated the design document to
explain in
    > detail.
    >     > > > > > >     >
    >     > > > > > >     > Thanks,
    >     > > > > > >     > Qing
    >     > > > > > >     >
    >     > > > > > >     >
    >     > > > > > >     > On 6/19/18, 12:09 PM, "Marco de Abreu"
<
    >     > > > > marco.g.abreu@googlemail.com
    >     > > > > > > .INVALID>
    >     > > > > > >     > wrote:
    >     > > > > > >     >
    >     > > > > > >     >     Okay, thanks for elaborating. I definitely
see
    > your
    >     > point
    >     > > > > there
    >     > > > > > > and we
    >     > > > > > >     >     definitely don't want these changes
to pile up.
    >     > > > > > >     >
    >     > > > > > >     >     I don't feel strongly about this and
won't stand
    > in the
    >     > > > way,
    >     > > > > I
    >     > > > > > > just
    >     > > > > > >     > want to
    >     > > > > > >     >     express my concern that this could
lead to people
    >     > having
    >     > > to
    >     > > > > > > touch all
    >     > > > > > >     >     language interfaces although they
might not
    > familiar
    >     > with
    >     > > > > them
    >     > > > > > > at all.
    >     > > > > > >     > On
    >     > > > > > >     >     the other hand we got enough contributors
who
    > could
    >     > help
    >     > > > them
    >     > > > > > > then
    >     > > > > > >     > before
    >     > > > > > >     >     the PR can get merged. So either way
works, but
    > I just
    >     > > > > wanted to
    >     > > > > > >     > highlight
    >     > > > > > >     >     that this could make it harder to
make changes
    > in the
    >     > > > > backend for
    >     > > > > > >     > people
    >     > > > > > >     >     who are not familiar with our frontend
API
    > languages.
    >     > If
    >     > > we
    >     > > > > got
    >     > > > > > > enough
    >     > > > > > >     >     people who could actively support
our
    > contributors in
    >     > > such
    >     > > > a
    >     > > > > > > case, we
    >     > > > > > >     >     should be totally fine with blocking
a PR until
    > the
    >     > APIs
    >     > > > have
    >     > > > > > > been
    >     > > > > > >     > adapted.
    >     > > > > > >     >
    >     > > > > > >     >     -Marco
    >     > > > > > >     >
    >     > > > > > >     >     On Tue, Jun 19, 2018 at 11:58 AM Naveen
Swamy <
    >     > > > > > > mnnaveen@gmail.com>
    >     > > > > > >     > wrote:
    >     > > > > > >     >
    >     > > > > > >     >     > Marco,
    >     > > > > > >     >     >
    >     > > > > > >     >     > Qing and I are working together
on this. The
    > idea is
    >     > > that
    >     > > > > we
    >     > > > > > > fail
    >     > > > > > >     > the build
    >     > > > > > >     >     > if there is a operator change
on the backend
    > and have
    >     > > not
    >     > > > > > > synced to
    >     > > > > > >     > the
    >     > > > > > >     >     > Scala API. We want to catch this
before
    > breaking the
    >     > > > user's
    >     > > > > > > code
    >     > > > > > >     > which will
    >     > > > > > >     >     > be a pretty bad experience.
    >     > > > > > >     >     >
    >     > > > > > >     >     >
    >     > > > > > >     >     >
    >     > > > > > >     >     > On Tue, Jun 19, 2018 at 11:54
AM, Marco de
    > Abreu <
    >     > > > > > >     >     > marco.g.abreu@googlemail.com.invalid>
wrote:
    >     > > > > > >     >     >
    >     > > > > > >     >     > > Hi Qing,
    >     > > > > > >     >     > >
    >     > > > > > >     >     > > thank you for working on
improving the
    >     > compatibility
    >     > > of
    >     > > > > our
    >     > > > > > > APIs!
    >     > > > > > >     >     > >
    >     > > > > > >     >     > > Your linked proposal does
not describe the
    >     > mentioned
    >     > > > > > > FILEHASH.
    >     > > > > > >     > Could you
    >     > > > > > >     >     > > elaborate a bit? Would this
be a hash of the
    > entire
    >     > > > file,
    >     > > > > > > some hash
    >     > > > > > >     >     > created
    >     > > > > > >     >     > > based on the signature of
the underlying C++
    >     > methods
    >     > > or
    >     > > > > > > maybe a
    >     > > > > > >     > different
    >     > > > > > >     >     > > approach?
    >     > > > > > >     >     > >
    >     > > > > > >     >     > > Also, at which step would
developers be
    > notified of
    >     > > the
    >     > > > > > > change? I'd
    >     > > > > > >     >     > propose
    >     > > > > > >     >     > > that we make this check
a nightly job to
    > prevent it
    >     > > > from
    >     > > > > > > blocking
    >     > > > > > >     > a PR
    >     > > > > > >     >     > and
    >     > > > > > >     >     > > forcing contributors who
are not familiar
    > with
    >     > Scala
    >     > > > > having
    >     > > > > > > to
    >     > > > > > >     > make a
    >     > > > > > >     >     > > change to the Scala package.
This would
    > allow us to
    >     > > > > follow up
    >     > > > > > >     >     > > asynchronously but still
provide actionable
    > events
    >     > > that
    >     > > > > one
    >     > > > > > > can be
    >     > > > > > >     >     > > subscribed to.
    >     > > > > > >     >     > >
    >     > > > > > >     >     > > Best regards,
    >     > > > > > >     >     > > Marco
    >     > > > > > >     >     > >
    >     > > > > > >     >     > > On Tue, Jun 19, 2018 at
11:00 AM Qing Lan <
    >     > > > > > > lanking520@live.com>
    >     > > > > > >     > wrote:
    >     > > > > > >     >     > >
    >     > > > > > >     >     > > > Hi all,
    >     > > > > > >     >     > > >
    >     > > > > > >     >     > > > I am one of the maintainer
for MXNet Scala
    >     > package.
    >     > > > > > > Currently I
    >     > > > > > >     > am
    >     > > > > > >     >     > > > building up a hash-check
system on the
    > generated
    >     > > API
    >     > > > > > > through C.
    >     > > > > > >     > The PR
    >     > > > > > >     >     > > is
    >     > > > > > >     >     > > > in this URL:
    >     > > > > > >     >     > > > https://github.com/apache/
    >     > > incubator-mxnet/pull/11239
    >     > > > > > >     >     > > > A file named FILEHASH
will be added to the
    > Scala
    >     > > that
    >     > > > > > > created
    >     > > > > > >     > the MD5
    >     > > > > > >     >     > > > string of the generated
API document. It
    > will
    >     > check
    >     > > > the
    >     > > > > > >     > signature of
    >     > > > > > >     >     > all
    >     > > > > > >     >     > > > the operators during
Scala CI testing. The
    >     > reason I
    >     > > > am
    >     > > > > > > doing
    >     > > > > > >     > this is to
    >     > > > > > >     >     > > > make sure Scala developers
will always be
    >     > reminded
    >     > > if
    >     > > > > > > there is an
    >     > > > > > >     >     > > operator
    >     > > > > > >     >     > > > name/argument changes
happened in
    > different PRs.
    >     > > More
    >     > > > > > > detailed
    >     > > > > > >     > info
    >     > > > > > >     >     > > > explained in here:
    >     > > > > > >     >     > > >
    >     > > > > > >     >     > > > https://cwiki.apache.org/
    >     > confluence/display/MXNET/
    >     > > > > > >     >     > > MXNet+Scala+API+Usability+Improvement
    >     > > > > > >     >     > > >
    >     > > > > > >     >     > > > Pros:
    >     > > > > > >     >     > > > This method will always
help us keep
    > backwards
    >     > > > > > > compatibility of
    >     > > > > > >     >     > operators
    >     > > > > > >     >     > > > for Scala
    >     > > > > > >     >     > > > Cons:
    >     > > > > > >     >     > > > Require update on the
FILEHASH with new
    > contents
    >     > > when
    >     > > > > > > there is an
    >     > > > > > >     >     > > operator
    >     > > > > > >     >     > > > change. Developers
who changed the operator
    >     > should
    >     > > > > `make
    >     > > > > > >     > scalapkg` to
    >     > > > > > >     >     > > > update that file.
    >     > > > > > >     >     > > >
    >     > > > > > >     >     > > > Please leave any thoughts
you may have for
    > this
    >     > > > design
    >     > > > > and
    >     > > > > > > feel
    >     > > > > > >     > free to
    >     > > > > > >     >     > > > review the code.
    >     > > > > > >     >     > > >
    >     > > > > > >     >     > > > Thanks,
    >     > > > > > >     >     > > > Qing
    >     > > > > > >     >     > > >
    >     > > > > > >     >     > > >
    >     > > > > > >     >     > >
    >     > > > > > >     >     >
    >     > > > > > >     >
    >     > > > > > >     >
    >     > > > > > >     >
    >     > > > > > >
    >     > > > > > >
    >     > > > > > >
    >     > > > >
    >     > > > >
    >     > > > >
    >     > > > > --
    >     > > > > Yizhi Liu
    >     > > > > DMLC member
    >     > > > > Amazon Web Services
    >     > > > > Vancouver, Canada
    >     > > > >
    >     > > >
    >     > >
    >     >
    >
    >
    >
    

Mime
View raw message