mxnet-dev mailing list archives

From "Chen, Ciyong" <ciyong.c...@intel.com>
Subject RE: Updates for 1.7.0 minor release
Date Sat, 13 Jun 2020 13:32:24 GMT
Hi Leonard,

Thanks for confirming the status of the build issue. Since it is no longer a blocker for the 1.7 release,
we can consider backporting the fix to the 1.7.x branch when it's ready.
The only remaining item is how to deal with the multiple license header issue; thank you for
helping with this 😊

Thanks,
-Ciyong

-----Original Message-----
From: Leonard Lausen <lausen@apache.org> 
Sent: Saturday, June 13, 2020 1:10 AM
To: dev@mxnet.incubator.apache.org
Subject: Re: Updates for 1.7.0 minor release

Thank you Ciyong. After further investigation, the build issue is not as severe as initially
claimed on GitHub. I checked the high-water memory usage during a single-process build: it's
2.7GB on master. On the 1.7 release, high-water usage is 2.2GB. This is much more acceptable than
the previously claimed >16GB usage and thus not a blocking issue from my perspective. I'll
also report the numbers for 1.5 and 1.6 later.
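
For readers who want to take a similar measurement, here is a minimal sketch of one way to
record the peak resident memory of a command's child processes. It is not necessarily how the
numbers above were obtained. It assumes Linux (where ru_maxrss is reported in KiB); the helper
name peak_child_rss_gib and the demo workload are hypothetical.

```python
import resource
import subprocess
import sys


def peak_child_rss_gib(cmd):
    """Run cmd to completion; return (exit_code, peak RSS in GiB).

    Assumes Linux, where ru_maxrss is reported in KiB. The value is the
    largest resident set size of any single waited-for descendant of this
    Python process, not the sum over concurrently running processes.
    """
    completed = subprocess.run(cmd)
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, peak_kib / (1024 ** 2)


if __name__ == "__main__":
    # Stand-in workload; replace it with the real build command
    # (hypothetical example: ["make", "-j1"]).
    demo = [sys.executable, "-c", "buf = bytearray(64 * 1024 * 1024)"]
    code, peak = peak_child_rss_gib(demo)
    print(f"exit code {code}, peak RSS {peak:.2f} GiB")
```

Because the reported value is the maximum over individual processes, it reflects the most
memory-hungry single compiler invocation rather than the total footprint of a parallel build.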

Fixing the respective implementations to be more compiler-friendly would still be good.

Looking at the parallel-build high-water memory usage on a 96-core machine, I saw a 45% increase
in memory usage during the build from 1.5 to 1.7.

Best regards
Leonard


On Fri, 2020-06-12 at 02:09 +0000, Chen, Ciyong wrote:
> Hi Chai,
> 
> Sorry for the late update.
> 
> Recently, several bug fixes [4] covering numpy operators, batchnorm 
> gradient, LSTM CPU gradient, CI/CD, and license issues were backported into v1.7.x.
> So far, there is one build issue and two license issues being tracked.
>         1) build issue #18501 (compiling indexing_op.o takes over 16GB of 
> memory), which @leezu stated is a blocker for the release[1].
>         2) license issues: the multiple license header issue[2] is under 
> discussion; the "no valid Apache license header" issue[3] has been identified, and 
> I'm working on the PR as @szha suggested.
> 
> If the community can help expedite items [1] and [2], it would 
> be very helpful.
> Once the above items are completed and there are no other critical 
> issues, we can cut rc0.
> 
> Thanks for your patience.
> 
> Thanks,
> -Ciyong
> 
> [1] https://github.com/apache/incubator-mxnet/issues/18501#issuecomment-642785535
> [2] https://github.com/apache/incubator-mxnet/issues/17329#issuecomment-641311199
> [3] https://github.com/apache/incubator-mxnet/pull/18478#issuecomment-642462904
> [4] PR list:
> #18358/#18339/#18311/#18352/#18456/#18316/#18482/#18502/#18517/#18464
> 
> 
> 
> -----Original Message-----
> From: Chaitanya Bapat <chai.bapat@gmail.com>
> Sent: Friday, June 12, 2020 1:34 AM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: RE: Updates for 1.7.0 minor release
> 
> Hey Ciyong,
> 
> Since the last discussion, the GPU memory regression PR has been reverted.
> Is there any update for when the rc0 for 1.7 will be cut?
> Can the community help expedite the process in any way?
> 
> Thanks
> Chai
> 
> On Wed, 13 May 2020 at 18:28, Chen, Ciyong <ciyong.chen@intel.com> wrote:
> 
> > Hi Ziyi,
> > 
> > Thanks for reaching out about the issue found in the upcoming 
> > release. Let's fix all these potential issues before cutting the 
> > rc0 tag 😊
> > I'll ask Tao to help merge the PR.
> > 
> > Thanks,
> > -Ciyong
> > 
> > -----Original Message-----
> > From: Patrick Mu <zm2263@columbia.edu>
> > Sent: Thursday, May 14, 2020 8:58 AM
> > To: dev@mxnet.apache.org
> > Subject: Re: RE: Updates for 1.7.0 minor release
> > 
> > Hi Ciyong,
> > 
> > We found a GPU memory usage regression triggered by PR 
> > https://github.com/apache/incubator-mxnet/pull/17767, which was 
> > pushed to the 2.0, 1.x, and 1.7 branches.
> > 
> > I have reverted this commit in 2.0, but we should also revert it in the 1.x 
> > and 1.7 branches. I have opened a revert PR against 1.x: 
> > https://github.com/apache/incubator-mxnet/pull/18309.
> > 
> > Could you help merge the revert into 1.x and 
> > 1.7 before making the rc0 tag?
> > 
> > Thanks,
> > Ziyi
> > 
> > On 2020/05/12 00:58:22, "Chen, Ciyong" <ciyong.chen@intel.com> wrote:
> > > Hi Chai,
> > > 
> > > Thanks a lot for your kind help in fixing this 😊
> > > I will continue with the remaining steps of the release process.
> > > 
> > > Thanks,
> > > -Ciyong
> > > 
> > > -----Original Message-----
> > > From: Chaitanya Bapat <chai.bapat@gmail.com>
> > > Sent: Tuesday, May 12, 2020 8:14 AM
> > > To: dev@mxnet.incubator.apache.org
> > > Subject: Re: Updates for 1.7.0 minor release
> > > 
> > > Hello Ciyong,
> > > 
> > > > With https://github.com/apache/incubator-mxnet/pull/18261 merged,
> > > > the nightly pipeline passes for 1.7.x. So as far as the 2 nightly test
> > > > pipelines [NightlyTests and NightlyTestsForBinaries] are concerned,
> > > > 1.7.x is good to go!
> > > > 
> > > Thanks,
> > > Chai
> > > 
> > > > On Sun, 10 May 2020 at 04:53, Chen, Ciyong <ciyong.chen@intel.com> wrote:
> > > > Hi MXNet Community,
> > > > 
> > > > > Here are some updates after the code freeze.
> > > > > 1. Nightly tests[1] and nightly binaries tests[2] were enabled; 
> > > > > many thanks to Chaitanya, who helped create and activate these 
> > > > > jobs for the v1.7.x branch.
> > > > > 2. A nightly test failure (incorrect with_seed path) was fixed 
> > > > > by Chaitanya [3].
> > > > > 3. A bug fix for external graph pass was contributed by Sam [4].
> > > > > 4. Recently, another case (test_large_vector.test_nn) failed 
> > > > > in the nightly test[5], and Chaitanya is 
> > > > > helping to address this issue[6].
> > > > 
> > > > > I'll keep monitoring the nightly tests before making an rc0 tag.
> > > > Please let me know if you have any other issues that should be 
> > > > included/fixed in this release.
> > > > 
> > > > Thanks,
> > > > -Ciyong
> > > > 
> > > > -----------
> > > > [1]
> > > > http://jenkins.mxnet-ci.amazon-ml.com/view/Nightly%20Tests/job/N
> > > > ig
> > > > ht
> > > > ly
> > > > Tests/job/v1.7.x/
> > > > [2]
> > > > http://jenkins.mxnet-ci.amazon-ml.com/view/Nightly%20Tests/job/N
> > > > ig
> > > > ht
> > > > ly
> > > > TestsForBinaries/job/v1.7.x/ [3]
> > > > https://github.com/apache/incubator-mxnet/pull/18220
> > > > [4] https://github.com/apache/incubator-mxnet/pull/18237
> > > > [5]
> > > > http://jenkins.mxnet-ci.amazon-ml.com/job/NightlyTestsForBinarie
> > > > s/ jo b/ v1.7.x/2/execution/node/232/log/ [6]
> > > > https://github.com/apache/incubator-mxnet/pull/18261
> > > > 
> > > > 
> > > > -----Original Message-----
> > > > From: Chen, Ciyong <ciyong.chen@intel.com>
> > > > Sent: Sunday, April 26, 2020 3:29 PM
> > > > To: dev@mxnet.incubator.apache.org
> > > > Cc: Marco de Abreu <marco.g.abreu@gmail.com>
> > > > Subject: Code freeze for 1.7.0 minor release
> > > > 
> > > > Hi MXNet Community,
> > > > 
> > > > > Code freeze for the 1.7.0 minor release is in effect (last commit: 38e6634)!
> > > > > This means no more NEW features will be accepted 
> > > > > for this release.
> > > > 
> > > > > Many thanks to everyone who helped submit, backport, and 
> > > > > review the PRs targeting this release.
> > > > > I've created draft Release Notes for the 1.7.0 release[1]; please 
> > > > > review them. Any comments or suggestions are highly appreciated.
> > > > 
> > > > > Currently, the nightly test pipeline [2][3] for v1.7.x is not 
> > > > > being triggered; cc @Marco de Abreu <marco.g.abreu@gmail.com> 
> > > > > to help take a look.
> > > > > I will keep monitoring the nightly test results for the current 
> > > > > code base, and continue with the rest of the release process.
> > > > 
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/MXNET/1.7.0+Release+
> > > > No
> > > > te
> > > > s
> > > > [2]
> > > > http://jenkins.mxnet-ci.amazon-ml.com/view/Nightly%20Tests/job/N
> > > > ig
> > > > ht
> > > > ly
> > > > Tests/job/v1.7.x/
> > > > [3]
> > > > http://jenkins.mxnet-ci.amazon-ml.com/view/Nightly%20Tests/job/N
> > > > ig
> > > > ht
> > > > ly
> > > > TestsForBinaries/job/v1.7.x/
> > > > 
> > > > 
> > > > Thanks,
> > > > -Ciyong
> > > > 
> > > > 
> > > 
> > > --
> > > *Chaitanya Prakash Bapat*
> > > *+1 (973) 953-6299*
> > > 
> > > [image: https://www.linkedin.com//in/chaibapat25]
> > > <https://github.com/ChaiBapchya>[image:
> > > https://www.facebook.com/chaibapat]
> > > <https://www.facebook.com/chaibapchya>[image:
> > > https://twitter.com/ChaiBapchya] <https://twitter.com/ChaiBapchya
> > > [image:
> > > https://www.linkedin.com//in/chaibapat25]
> > > <https://www.linkedin.com//in/chaibapchya/>
> > > 
> 
