mxnet-dev mailing list archives

From Aaron Markham <aaron.s.mark...@gmail.com>
Subject Re: Update on upcoming changes to the MXNet CI: Jenkins
Date Thu, 13 Feb 2020 17:20:50 GMT
+1 These are good action items that should help alleviate part of the
CI issues.

The following comments are not to take away from your proposal. Move
forward, assuming the community agrees.
I'd really like to see particular tests run only when the PR
touches a related part of the codebase. While this is more effort, it
would make a major difference. Light research shows that projects have
been doing this for quite some time, so it wouldn't require inventing
anything new or a deep exploration.

I realize there are a lot of interdependencies and it would probably
not work for everything. But, what if we start small?
--> Docs pages (*.rst, *.md, *.html, *.js, *.css): don't trigger most
tests, especially GPU and cross-platform tests.
--> Tutorials that have GPU requirements run their own validation
tests, and tutorials that don't have a GPU requirement don't get tested
on GPUs.
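
A minimal sketch of the docs-only rule above, assuming CI can get the
list of files changed in a PR. The suffix list comes from the bullet
above; the "website" suite name and the select_suites helper are
hypothetical and only illustrate the idea, they are not part of the
current MXNet CI.

```python
from pathlib import PurePosixPath

# File types treated as docs-only, taken from the list above.
DOC_SUFFIXES = {".rst", ".md", ".html", ".js", ".css"}

# Hypothetical suite names; "website" is an assumed docs-build job.
ALL_SUITES = {"unix-cpu", "unix-gpu", "windows-gpu", "website"}
DOCS_ONLY_SUITES = {"website"}


def is_doc_file(path):
    """Return True if the path has one of the docs-only suffixes."""
    return PurePosixPath(path).suffix.lower() in DOC_SUFFIXES


def select_suites(changed_files):
    """Pick the suites to run for a PR from its changed files.

    If every changed file is a docs file, run only the docs suites.
    Anything else (including an empty list) falls back to the full
    set, so the rule can only skip work when it is clearly safe.
    """
    if changed_files and all(is_doc_file(f) for f in changed_files):
        return set(DOCS_ONLY_SUITES)
    return set(ALL_SUITES)
```

Falling back to the full set on any non-docs change keeps the rule
conservative, which fits the "start small" approach.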

Cheers,
Aaron



On Wed, Feb 12, 2020 at 10:12 AM Davydenko, Denis
<dzianis.davydzenka@gmail.com> wrote:
>
> Hello, MXNet dev community,
> As you all know, the experience with CI infrastructure isn’t ideal in spite of its
> high cost. For this reason, we’re proposing the following changes to improve stability,
> reduce cost, and grant more control to contributors. As we work on a refresh of CI, we believe
> these changes will reduce the pain we all suffer when we try to push a PR through the system.
>
> The proposed changes are:
> - Fix missing status reports between GitHub and Jenkins
> - Update Jenkins permission groups to re-trigger builds
> - Introduce per-PR CI bot
>
> Details:
>
> - Fix missing status reports
> Currently, whenever a commit is added to a PR, CI runs on that commit. Sometimes the
> CI run status is missing from the commit in GitHub despite the run having completed in Jenkins.
> Example CI run: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-17376/17/pipeline
> Commit status in GitHub (missing unix-cpu, unix-gpu and windows-gpu statuses): https://github.com/apache/incubator-mxnet/pull/17376#partial-pull-merging
> Problem: There seems to be a bug where some status reports are missing on GitHub. The
> hypothesis is that there is some issue with GitHub hooks.
>
> - Update Jenkins permission groups to re-trigger builds
> Problem: Currently, only MXNet Committers and selected people from AWS have the ability
> to re-trigger CI runs on PRs. This leaves PR Authors waiting for authorized users to re-trigger
> their PRs for them.
> Solution: Allow members of these categories to re-trigger PR builds: Jenkins Admins,
> MXNet Committers, and PR Authors.
>
> - Introduce per-PR CI bot
> Problem: As of today, MXNet CI is fully automated. It runs every time a commit is pushed to
> your GitHub PR. This results in a lot of unnecessary CI runs and added cost.
> Solution: Switch to a manual trigger. Users from the authorized groups (one of the three categories
> mentioned above) can trigger a CI run by adding a simple comment to the PR: “[mxnet-ci] run”.
>
> --
> Thank you,
>
> AWS MXNet team
>
>
>
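
The manual trigger described in the quoted proposal could be sketched
roughly as below. This is only an illustration under assumptions: the
trigger phrase and the three authorized categories come from the
proposal, while the should_trigger function, its parameters, and the
exact-match rule for the comment text are hypothetical and do not
describe the actual bot.

```python
# Trigger phrase quoted in the proposal.
TRIGGER_PHRASE = "[mxnet-ci] run"


def should_trigger(comment_body, commenter, pr_author, committers, admins):
    """Decide whether a PR comment should start a CI run.

    Assumption: the comment must consist of the trigger phrase only
    (ignoring surrounding whitespace). The proposal does not say
    whether extra text in the comment is allowed.
    """
    if comment_body.strip() != TRIGGER_PHRASE:
        return False
    # Authorized categories from the proposal: Jenkins Admins,
    # MXNet Committers, and the PR Author.
    return (
        commenter == pr_author
        or commenter in committers
        or commenter in admins
    )
```

For example, with hypothetical users, a comment of "[mxnet-ci] run"
from the PR author would start a run, while the same comment from an
unrelated user would be ignored.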
