mxnet-dev mailing list archives

From Tao Lv <ta...@apache.org>
Subject Re: [Discussion] MXNet 1.5.1 release
Date Wed, 28 Aug 2019 15:36:13 GMT
@Pedro, it seems the issue is still open on the master branch. Do you still
think we can have your fix on the v1.5.x branch?

Progress since last update:
1. We received several more proposals in the GitHub thread [1]. I humbly
ask the reporters to cherry pick their fixes to the v1.5.x branch. I will keep
tracking the progress and the health of the release branch.
2. Thanks to @Lai, the license issue with the Julia cat image was fixed on the
master branch, and I opened a PR to pick it to v1.5.x [2].
3. The GPU OOM issue was fixed on the master branch by @Lin [3], but there
is a problem with porting the fix to the v1.5.x branch [4].

Opens:
1. https://github.com/apache/incubator-mxnet/pull/15803 still cannot pass
the CI;
2. Call for an update from the Julia folks about the backporting of [5] and [6];
3. The license issue of cub and pybind is still open. @Lai opened a PR [7] to
update the cub submodule, but it seems to need more effort than just a commit ID
update. I suspect that we cannot finish this work in the 1.5.1 patch release.
4. Still no progress on the sidebar issue on the web page [8].
5. Call for a conclusion about fixing the GPU OOM issue in 1.5.1.

Besides, is there any preference for the release timeline of the 1.5.1 patch
release? Please share so I can propose a time for the code freeze.

Thanks,
-tao

[1] https://github.com/apache/incubator-mxnet/issues/15613
[2] https://github.com/apache/incubator-mxnet/pull/16026
[3] https://github.com/apache/incubator-mxnet/pull/15948
[4] https://github.com/apache/incubator-mxnet/pull/15999
[5] https://github.com/apache/incubator-mxnet/pull/15609
[6] https://github.com/apache/incubator-mxnet/pull/15608
[7] https://github.com/apache/incubator-mxnet/pull/15963
[8] https://github.com/apache/incubator-mxnet/issues/15200

On Wed, Aug 28, 2019 at 5:50 AM Pedro Larroy <pedro.larroy.lists@gmail.com>
wrote:

> Ok. I was just asking if we want this fix in 1.5.1 since it addresses
> crashes using multiprocessing. The problem with cherry picking is that the
> patch contains the dynamic load change which shouldn't impact anything else
> but is not supposed to go in a release branch.
>
> On Tue, Aug 27, 2019 at 1:19 PM Lin Yuan <apeforest@gmail.com> wrote:
>
> > https://github.com/apache/incubator-mxnet/pull/15762 contains some
> > unrelated changes which are being reverted. Please do not cherry pick it
> > yet.
> >
> > On Mon, Aug 26, 2019 at 4:25 PM Pedro Larroy <pedro.larroy.lists@gmail.com>
> > wrote:
> >
> > > There's a fix that I did for an issue which seems to still produce crashes
> > > in 1.5 for some users, which I was notified of today and which is fixed in
> > > master.
> > >
> > > Might be useful to put in 1.5.1:
> > > https://github.com/apache/incubator-mxnet/pull/15762   ?
> > >
> > > Pedro.
> > >
> > > On Tue, Aug 20, 2019 at 7:49 AM Tao Lv <taolv@apache.org> wrote:
> > >
> > > > Hi dev,
> > > >
> > > > Here is an update for the 1.5.1 patch release.
> > > >
> > > > 1. Thanks to the effort of the whole community, we have cherry picked a
> > > > bunch of fixes to the v1.5.x branch. So far, the branch looks healthy:
> > > > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/activity/
> > > > 2. https://github.com/apache/incubator-mxnet/pull/15803 cannot pass the
> > > > CI;
> > > > 3. I hope the Julia folks can take a look at the backporting of
> > > > https://github.com/apache/incubator-mxnet/pull/15609 and
> > > > https://github.com/apache/incubator-mxnet/pull/15608 - do we still need
> > > > them?
> > > > 4. The license issue of cub and pybind is still not fixed. We also have a
> > > > license issue with a cat image in the Julia examples:
> > > > https://github.com/apache/incubator-mxnet/issues/15542
> > > > 5. Still no progress on the sidebar issue:
> > > > https://github.com/apache/incubator-mxnet/issues/15200
> > > > 6. There is a GPU OOM issue in the 1.5.0 release, already root-caused by
> > > > Lin:
> > > > https://github.com/apache/incubator-mxnet/issues/15703#issuecomment-522780492
> > > > We need to decide whether we want to get it fixed in the 1.5.1 patch
> > > > release.
> > > >
> > > > Please find details in
> > > > https://cwiki.apache.org/confluence/display/MXNET/1.5.1+Release+Plan+and+Status
> > > >
> > > > Thanks,
> > > > -tao
> > > >
> > > > On Mon, Aug 12, 2019 at 9:57 PM Zhao, Patric <patric.zhao@intel.com>
> > > > wrote:
> > > >
> > > > > Thanks for the explanation, Marco & Tao. Sounds great!
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Tao Lv <taolv@apache.org>
> > > > > > Sent: Monday, August 12, 2019 9:54 PM
> > > > > > To: dev@mxnet.incubator.apache.org
> > > > > > Subject: Re: [Discussion] MXNet 1.5.1 release
> > > > > >
> > > > > > > Regarding the open issue, is there a default code owner/maintainer? If
> > > > > > > so, he/she will be the right person to look into the issue.
> > > > > > > https://github.com/apache/incubator-mxnet/blob/master/CODEOWNERS
> > > > > > >
> > > > > >
> > > > > > I have no idea. But CODEOWNERS is used to receive change notifications;
> > > > > > it does not actually indicate the maintainer of a piece of code.
> > > > > >
> > > > > > > Do we have regular build, run, functionality and performance testing for
> > > > > > > this release?
> > > > > >
> > > > > >
> > > > > > As Marco mentioned, build, run and functionality of the v1.5.x branch are
> > > > > > tracked automatically by the CI for each cherry pick pull request and by
> > > > > > the nightly tests here:
> > > > > > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/activity
> > > > > > I see it's healthy so far.
> > > > > >
> > > > > > For performance, Shufan will track CPU performance with his test suite and
> > > > > > send out the report once the branch is frozen. I'm not sure if there are
> > > > > > any other performance tests.
> > > > > >
> > > > > > On Mon, Aug 12, 2019 at 9:36 PM Marco de Abreu
> > > > > > <marco.g.abreu@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Patric,
> > > > > > >
> > > > > > > CI should automatically pick up the branch and validate it as usual.
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Marco
> > > > > > >
> > > > > > > On Mon, Aug 12, 2019 at 15:22, Zhao, Patric <patric.zhao@intel.com> wrote:
> > > > > > >
> > > > > > > > It's great work, Tao 😊
> > > > > > > >
> > > > > > > > Regarding the open issue, is there a default code owner/maintainer? If
> > > > > > > > so, he/she will be the right person to look into the issue.
> > > > > > > > https://github.com/apache/incubator-mxnet/blob/master/CODEOWNERS
> > > > > > > >
> > > > > > > > Do we have regular build, run, functionality and performance testing for
> > > > > > > > this release?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > --Patric
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Tao Lv <taolv@apache.org>
> > > > > > > > > Sent: Monday, August 12, 2019 8:59 PM
> > > > > > > > > To: dev@mxnet.incubator.apache.org
> > > > > > > > > Subject: Re: [Discussion] MXNet 1.5.1 release
> > > > > > > > >
> > > > > > > > > Update:
> > > > > > > > >
> > > > > > > > > We're cherry picking fixes from the master to the v1.5.x branch. Some
> > > > > > > > > of them are already merged. Please find details on the cwiki page:
> > > > > > > > > https://cwiki.apache.org/confluence/display/MXNET/1.5.1+Release+Plan+and+Status
> > > > > > > > >
> > > > > > > > > There are still 3 opens:
> > > > > > > > > 1. Nightly test failure on CI (
> > > > > > > > > https://github.com/apache/incubator-mxnet/issues/15374): The issue
> > > > > > > > > is still open. I'm wondering if it has been fixed or not. If not, is
> > > > > > > > > there anyone working on it?
> > > > > > > > > 2. Broken Sidebar on website API for master and 1.5.0 (
> > > > > > > > > https://github.com/apache/incubator-mxnet/issues/15200): I don't
> > > > > > > > > see any progress on this issue. Do we still want to include it in the
> > > > > > > > > 1.5.1 patch release?
> > > > > > > > > 3. License issues need to be fixed before 1.6 release (
> > > > > > > > > https://github.com/apache/incubator-mxnet/issues/15542): Currently
> > > > > > > > > the license issue for code and images is partially fixed on the
> > > > > > > > > master branch and will be picked to v1.5.x soon. The MKLML license
> > > > > > > > > issue is pushed out to the 1.6 release. But the license issue for cub
> > > > > > > > > and pybind is still open.
> > > > > > > > >
> > > > > > > > > Let me know if you have any suggestions. Thanks for your support!
> > > > > > > > >
> > > > > > > > > -tao
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Aug 7, 2019 at 11:03 PM Tao Lv <taolv@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Update:
> > > > > > > > > >
> > > > > > > > > > Thanks to wkcn's report, Issue #15774 [1] and the fix #15751 [2]
> > > > > > > > > > are added to the scope of the 1.5.1 patch release.
> > > > > > > > > > For issue #15703 [3], I'm still waiting for the response from
> > > > > > > > > > the reporter.
> > > > > > > > > > Issue #15431 [4] was closed as a false positive report.
> > > > > > > > > > I also included several MKL-DNN backend issues reported by mxnet users
> > > > > > > > > > and downstream projects. They are already fixed on the master branch.
> > > > > > > > > >
> > > > > > > > > > Please kindly check the full list of issues that need to be included in
> > > > > > > > > > the 1.5.1 patch release:
> > > > > > > > > > https://cwiki.apache.org/confluence/display/MXNET/1.5.1+Release+Plan+and+Status
> > > > > > > > > >
> > > > > > > > > > For issues which are already fixed on the master branch, we will start
> > > > > > > > > > to cherry pick the fix commit to the v1.5.x branch. For issues
> > > > > > > > > > which are still open, we will start to track the fix process.
> > > > > > > > > >
> > > > > > > > > > Thanks for your great support. Let me know if you have any
> > > > > > > > > > questions or concerns.
> > > > > > > > > >
> > > > > > > > > > -tao
> > > > > > > > > >
> > > > > > > > > > [1] https://github.com/apache/incubator-mxnet/issues/15774
> > > > > > > > > > [2] https://github.com/apache/incubator-mxnet/pull/15751
> > > > > > > > > > [3] https://github.com/apache/incubator-mxnet/issues/15703
> > > > > > > > > > [4] https://github.com/apache/incubator-mxnet/issues/15431
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Aug 6, 2019 at 2:04 PM Tao Lv <taolv@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > >>
> > > > > > > > > >> Per Sam's proposal [1], Issue #15737 [2] and the fix [3] are
> > > > > > > > > >> added to the scope of the 1.5.1 patch release.
> > > > > > > > > >>
> > > > > > > > > >> A friendly reminder: the window for proposing issues will close
> > > > > > > > > >> before 11pm 8/7 CST (8am 8/7 PST). After that, we will start to
> > > > > > > > > >> cherry pick fixes to the v1.5.x branch.
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> [1]
> > > > > > > > > >> https://github.com/apache/incubator-mxnet/issues/15613#issuecomment-518430120
> > > > > > > > > >> [2] https://github.com/apache/incubator-mxnet/issues/15737
> > > > > > > > > >> [3] https://github.com/apache/incubator-mxnet/pull/15692
> > > > > > > > > >>
> > > > > > > > > >> On Thu, Aug 1, 2019 at 4:24 PM Tao Lv <taolv@apache.org> wrote:
> > > > > > > > > >>
> > > > > > > > > >>> Hi Sandeep/Lai,
> > > > > > > > > >>>
> > > > > > > > > >>> Thank you for the prompt response!
> > > > > > > > > >>>
> > > > > > > > > >>> https://github.com/apache/incubator-mxnet/issues/15200 is
> > > > > > > > > >>> added to the list to track the sidebar issue.
> > > > > > > > > >>>
> > > > > > > > > >>> On Thu, Aug 1, 2019 at 7:54 AM sandeep krishnamurthy <
> > > > > > > > > >>> sandeep.krishna98@gmail.com> wrote:
> > > > > > > > > >>>
> > > > > > > > > >>>> Thank you Tao and Shufan.
> > > > > > > > > >>>> The missing sidebar bug in the API documentation is an inconvenience
> > > > > > > > > >>>> for the user. It would be great if we can fix it with 1.5.1.
> > > > > > > > > >>>>
> > > > > > > > > >>>> On Wed, Jul 31, 2019, 10:14 AM Lai Wei <royweilai@gmail.com> wrote:
> > > > > > > > > >>>>
> > > > > > > > > >>>> > Hi Tao,
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > Thank you so much for driving it. Currently the nightly tests
> > > > > > > > > >>>> > on tutorials are failing and need to be fixed [3]. I have updated
> > > > > > > > > >>>> > the issue [1] and cwiki [2].
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > [1] https://github.com/apache/incubator-mxnet/issues/15613
> > > > > > > > > >>>> > [2] https://cwiki.apache.org/confluence/display/MXNET/1.5.1+Release+Plan+and+Status
> > > > > > > > > >>>> > [3] https://github.com/apache/incubator-mxnet/issues/15374
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > Best Regards
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > Lai
> > > > > > > > > >>>> >
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > On Wed, Jul 31, 2019 at 8:04 AM Tao Lv <taolv@apache.org> wrote:
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > > Hi community,
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > Thanks to the initiative from Sam (samskalicky@github), we already
> > > > > > > > > >>>> > > have a discussion thread [1] on github about the defects and bugs
> > > > > > > > > >>>> > > exposed in the 1.5.0 release.
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > Shufan (juliusshufan@github) and I (TaoLv@github) would like to
> > > > > > > > > >>>> > > manage the release of 1.5.1. This will be our debut in the release
> > > > > > > > > >>>> > > process, so your comments are always valuable.
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > Per SemVer 2.0 [2], MXNet 1.5.1 will be a patch release which
> > > > > > > > > >>>> > > contains backwards-compatible fixes only.
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > I have created a page on cwiki [3] to track the release process
> > > > > > > > > >>>> > > and moved the issues and PRs mentioned in the github discussion
> > > > > > > > > >>>> > > thread to the page.
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > Here I would like to ask the community to:
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > (1) Raise any other defect or regression you identified in the
> > > > > > > > > >>>> > > 1.5.0 release. Please file a github issue for it and note the
> > > > > > > > > >>>> > > issue number in this thread;
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > (2) Please comment with one sentence on why you think the issue
> > > > > > > > > >>>> > > is critical and must be in the 1.5.1 release;
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > (3) If the issue is already fixed on the master branch or already
> > > > > > > > > >>>> > > has a PR WIP, please also note the fix commit id or PR number;
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > (4) If the issue is still open and there is no PR WIP, please
> > > > > > > > > >>>> > > indicate whether you'd be willing to help out;
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > (5) Feel free to comment if you have any other suggestions for
> > > > > > > > > >>>> > > the release.
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > I suggest keeping this thread open for one week to collect
> > > > > > > > > >>>> > > enough information and proposals before we decide the timeline
> > > > > > > > > >>>> > > for the release. So your timely response will be highly
> > > > > > > > > >>>> > > appreciated!
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > PS: Sorry to say that even as a committer, this is the first
> > > > > > > > > >>>> > > time for me to manage a release. So it would be great if an
> > > > > > > > > >>>> > > experienced committer can help to guide the process.
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > -tao
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > [1] https://github.com/apache/incubator-mxnet/issues/15613
> > > > > > > > > >>>> > > [2] https://semver.org/
> > > > > > > > > >>>> > > [3] https://cwiki.apache.org/confluence/display/MXNET/1.5.1+Release+Plan+and+Status
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> >
> > > > > > > > > >>>>
> > > > > > > > > >>>
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>
