mxnet-dev mailing list archives

From Marco de Abreu
Subject CI impaired
Date Wed, 21 Nov 2018 04:24:20 GMT

I'd like to let you know that our CI was impaired and down for the last few
hours. After getting the CI back up, I noticed that a silent Jenkins update
broke our upscale detection and, with it, our auto scaling. Manual
scaling is currently not possible, and stopping the scaling won't help
either because there are currently no p3 instances available, which means
that all jobs will fail nonetheless. In a few hours, the auto scaling
will have recycled all slaves through the down-scale mechanism and we will
be out of capacity. This will lead to resource starvation and thus timeouts.

Your PRs will be properly registered by Jenkins, but please expect the jobs
to time out and thus fail your PRs.

I will fix the auto scaling as soon as I'm awake again.

Sorry for the inconvenience caused.

Best regards,

P.S. Sorry for the brief email and the lack of further fixes, but it's
5:30 AM now and I've been working for 17 hours.
