mxnet-dev mailing list archives

From Chaitanya Bapat <chai.ba...@gmail.com>
Subject Update : CI windows-gpu Failure
Date Fri, 27 Mar 2020 04:16:15 GMT
Hello MXNet community,

The windows-gpu builds have been failing on CI for over 3 days now.
The team (Leo, Ningyuan, Joe, Pedro, and I) is working to identify the
root cause and fix it.

Issue: The linker is running out of memory (OOM) because the 32-bit toolchain
cannot address all of the memory available on the machine.

Multiple attempts have been made so far, with limited success:
1. Reduced the number of builds per worker (for the windows-cpu node) from 3 to 1.
2. Updated the toolchain from 32-bit to 64-bit (as pointed out by multiple
people).
PR: https://github.com/apache/incubator-mxnet/pull/17916
(related to Leo’s PR: https://github.com/apache/incubator-mxnet/pull/17912)

Road to unblock:
An updated AMI coupled with the updated toolchain should help.
Ningyuan has an updated AMI for Windows (PR:
https://github.com/apache/incubator-mxnet/pull/17808) with VS2019, CUDA 10.2,
CMake fixes, etc.

We will get it deployed by tomorrow and update the status accordingly.

Thanks for the patience. Apologies for the inconvenience caused.
Thank you 🙏
Chai,
on behalf of the MXNet CI team

-- 
*Chaitanya Prakash Bapat*
*+1 (973) 953-6299*

