mxnet-dev mailing list archives

From Dominic Divakaruni <dominic.divakar...@gmail.com>
Subject Re: MXNet -> Apache Migration proposal
Date Sat, 08 Jul 2017 00:07:07 GMT
great stuff!! glad to see this getting close!

On Fri, Jul 7, 2017 at 3:47 PM, Ly Nguyen <nguyenlyx@gmail.com> wrote:

> We have validated that merges and pull requests against an
> Apache fork of MXNet build successfully on builds.apache.org:
> https://builds.apache.org/blue/organizations/jenkins/incubator-mxnet-master2/detail/master/13/pipeline
> https://builds.apache.org/blue/organizations/jenkins/incubator-mxnet-master2/detail/PR-3/1/pipeline
>
> We have also added a dummy nightly run to be populated with builds and test
> cases after migration, as discussed. We can now move forward with the
> migration to Apache and I recommend the following steps:
> - [ ] Add Pono as owner
> - [ ] Pono adds Apache git hooks to MXNet repo
> - [ ] Change source control of Apache Jenkins jobs to point to MXNet repo,
> verify a run is successful
> - [ ] Change MXNet org to Apache, verify a run is successful and that
> mxnet.io is still building
> - [ ] Start docs build to mxnet.apache.org
> Note that one kink to iron out is that PR build statuses aren’t being
> updated. Here’s a ticket to follow:
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=25&projectKey=INFRA&view=detail&selectedIssue=INFRA-14540
>
>
> On Sat, Jul 1, 2017 at 9:15 PM, shiwen hu <yajiedesign@gmail.com> wrote:
>
> > 1. The `mxnet directory` is a directory on the current CI server. It
> > contains the necessary files, including library dependencies, data files
> > needed for testing, the required build scripts, and so on. You can ask
> > Mu Li to copy it from the current CI server.
> > 2. Download the graphics driver from http://www.nvidia.com/Download/index.aspx
> > 3. Launch is a small program. Once you run it, you should be able to
> > see what to do at a glance.
> >
> > 2017-07-02 10:03 GMT+08:00 Naveen Swamy <mnnaveen@gmail.com>:
> >
> > > @yajiedesign
> > > We are building a new slave to be used in Apache Infra. The instructions
> > > at https://gist.github.com/yajiedesign/40b3809b51a1706d353e9129071b14fb
> > > for setting up a new slave from scratch are insufficient (probably
> > > outdated). We ran into quite a few problems setting up OpenBLAS and
> > > OpenCV (those instructions were missing). Though we were able to get
> > > through these problems, we anticipate further ones.
> > > Since we want to move our infrastructure to Apache by the end of next
> > > week, we have paused the effort of setting up a Windows slave and testing
> > > the Linux slaves that are already set up.
> > >
> > > Is it possible for you to update those instructions? Meanwhile, we have
> > > requested Mu Li to create an AMI out of the existing slave.
> > >
> > > Can I also ask you to provide instructions on how to create a pip
> > > package for Windows? Currently, version 0.10 does not have a Windows pip
> > > package.
> > >
> > > Thanks, Naveen
> > >
> > >
> > >
> > > On Sat, Jul 1, 2017 at 12:14 AM, shiwen hu <yajiedesign@gmail.com> wrote:
> > >
> > > > What is the problem with the Windows CI?
> > > >
> > > > 2017-07-01 9:06 GMT+08:00 Ly Nguyen <nguyenlyx@gmail.com>:
> > > >
> > > > > This week's summary:
> > > > > 1. Wrote FAQ and publicized CI wiki
> > > > > 2. Plan was to complete migration by end of next week
> > > > > >     1. Spent 1.5 days trying to set up a Windows slave. This was not
> > > > > > successful, and it would be more productive to create an AMI from the
> > > > > > currently running slaves. Mu says a running Windows slave is not
> > > > > > necessary for migration, but that means we would be losing Windows
> > > > > > coverage.
> > > > > >     2. The goal for this week was to ensure that PRs, merges, and
> > > > > > nightlies against the fork trigger builds that pass. There were a lot
> > > > > > of hurdles. Many items had to happen in sequence and depended on
> > > > > > others’ schedules. Namely,
> > > > >         1. accepted invitation to be committer on Monday morning
> > > > >         2. received Apache account Tuesday morning
> > > > >         3. got access to Jenkins & repo Wednesday morning
> > > > > >         4. filed tickets for the Infra team to add webhooks, which
> > > > > > were addressed this morning
> > > > > > https://issues.apache.org/jira/browse/INFRA-14472
> > > > > >         5. Apache builds of all projects, including MXNet’s, were not
> > > > > > happening because of an infra issue, so there was not much traction
> > > > > > today
> > > > > > https://issues.apache.org/jira/browse/INFRA-14476
> > > > >     3. Filed a ticket for support on building docs website
> > > > > https://issues.apache.org/jira/browse/INFRA-14479
> > > > >     4. Filed a ticket to reconfigure donated linux slaves
> > > > > https://issues.apache.org/jira/browse/INFRA-14478
> > > > >
> > > > > > On Tue, Jun 27, 2017 at 1:10 PM, Ly Nguyen <nguyenlyx@gmail.com> wrote:
> > > > >
> > > > > > > We are aiming to complete migration of MXNet to Apache by July 10.
> > > > > > > This involves transferring the GitHub repo ownership to Apache.
> > > > > > >
> > > > > > > Migration is tracked at this project board:
> > > > > > > https://github.com/dmlc/mxnet/projects/6
> > > > > > > As a part of the migration, we also need to adopt the Apache release
> > > > > > > process for our next release, which is mid-July. This wiki
> > > > > > > <https://cwiki.apache.org/confluence/display/MXNET/Continuous+Integration>
> > > > > > > gives an overview of how the process works. It also lists some
> > > > > > > automation tasks that come after the completion of code base
> > > > > > > migration and the next release.
> > > > > >
> > > > > > > FAQ:
> > > > > > >
> > > > > > >    1. Why are we migrating the code base to Apache ownership?
> > > > > > >       1. This is one of the steps toward graduating from Apache
> > > > > > >          incubation.
> > > > > > >    2. When is this happening?
> > > > > > >       1. We are aiming for migration to complete by July 10th.
> > > > > > >    3. Will my commits/contributions still exist after migration?
> > > > > > >       1. Yes. Existing commits will still appear under your existing
> > > > > > >          GitHub id, and stats will carry over. New commits will also
> > > > > > >          appear under your existing GitHub id, so long as you’ve
> > > > > > >          configured your ~/.gitconfig with an email address which
> > > > > > >          you’ve linked in your GitHub profile.
> > > > > > >       2. Committers will also need to link their Apache ids with
> > > > > > >          their GitHub ids to gain write access, in which case the
> > > > > > >          above answer still applies. See #9 on how to link your
> > > > > > >          Apache id.
> > > > > > >    4. What will happen to my in-flight pull requests?
> > > > > > >       1. They will remain intact.
> > > > > > >    5. Will I still be a member/owner after migration?
> > > > > > >       1. Current list of Apache MXNet committers:
> > > > > > >          https://wiki.apache.org/incubator/MXNetProposal
> > > > > > >       2. If you’re not an Apache committer, you lose
> > > > > > >          membership/ownership rights.
> > > > > > >       3. Apache Infra are the only people with Owner/Admin
> > > > > > >          permissions there.
> > > > > > >       4. Apache committers will have write access.
> > > > > > >    6. What other things will be transferred with the repository?
> > > > > > >       1. Wiki, stars, watchers, webhooks, services, deploy keys
> > > > > > >    7. What will my fork be associated with after migration?
> > > > > > >       1. It will remain associated with the transferred repository.
> > > > > > >    8. Will I have to change all references to
> > > > > > >       http://github.com/dmlc/mxnet?
> > > > > > >       1. All links to http://github.com/dmlc/mxnet will automatically
> > > > > > >          be redirected to the new location when issuing `git clone`,
> > > > > > >          `git fetch`, `git push`, etc. (as long as we don’t create
> > > > > > >          another “mxnet” repository under DMLC). However, to avoid
> > > > > > >          confusion, you can change the links where possible, and
> > > > > > >          change the remote: `git remote set-url origin new_url`
> > > > > > >    9. How do I gain write access to the repo?
> > > > > > >       1. First, you need to be a committer. Then use
> > > > > > >          https://gitbox.apache.org/setup/ to associate the Apache and
> > > > > > >          GitHub accounts. Note that all committers will need to
> > > > > > >          enable 2-factor authentication on GitHub.
> > > > > > >    10. Are we also moving mxnet CI? If so, what is the new location?
> > > > > > >        Will nightly tests continue to run? How can I add new tests?
> > > > > > >       1. We will rely on Apache’s build server to run our builds.
> > > > > > >       2. It will first only run unit tests for PRs and merges. Tests
> > > > > > >          can be added following the structure set up in
> > > > > > >          https://github.com/dmlc/mxnet/blob/master/Jenkinsfile
> > > > > > >       3. Nightly tests are currently running at
> > > > > > >          http://jenkins-master-elb-1979848568.us-east-1.elb.amazonaws.com/
> > > > > > >          and will gradually run in Apache’s build server too. There,
> > > > > > >          we will provide artifacts such as pip wheels and source
> > > > > > >          packages for the community to test.
> > > > > > >          1. Automated releases will happen on
> > > > > > >             http://jenkins-master-elb-1979848568.us-east-1.elb.amazonaws.com/
> > > > > > >             as Apache’s build doesn’t support key storage.
> > > > > > >    11. Is mxnet.io moving too?
> > > > > > >       1. For some time we will have both mxnet.apache.org and
> > > > > > >          mxnet.io hosting the docs. When we are confident that
> > > > > > >          mxnet.apache.org is stable, we will redirect mxnet.io to
> > > > > > >          there.
> > > > > >
> > > > > >
> > > > > > > Link on GitHub repo transfers:
> > > > > > > https://help.github.com/articles/about-repository-transfers/
> > > > > >
> > > > > > Feel free to ask any other questions.
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Wed, Jun 7, 2017 at 12:53 PM, Ly Nguyen <nguyenlyx@gmail.com> wrote:
> > > > > >
> > > > > > >> I’ve documented the detailed steps below on the process of
> > > > > > >> migrating MXNet -> Apache for open feedback and discussion.
> > > > > > >>
> > > > > > >> Essentially Amazon will be providing the GPU build slaves to be
> > > > > > >> hooked into Apache’s Jenkins build Master. We’ll first make sure
> > > > > > >> that Apache can build a fork of MXNet, before officially
> > > > > > >> transferring ownership of the MXNet repo.
> > > > > > >>
> > > > > > >> Steps to migration:
> > > > > > >> 1.      Provide Apache with Linux slaves & slave tags
> > > > > > >>    a.      Provide Apache with slave configuration (tags, remote
> > > > > > >>            root dir, etc.)
> > > > > > >>    b.      Spin up 6 slaves
> > > > > > >>    c.      Launch connection via JNLP
> > > > > > >> 2.      Apache forks MXNet repo and makes sure builds are successful
> > > > > > >>         on their build set up
> > > > > > >>    a.      Ask Apache to give me committer rights
> > > > > > >>    b.      I remove the Windows jobs until a later time
> > > > > > >>    c.      Apache sets up Jenkins jobs and Github webhooks
> > > > > > >>       i.      Build every commit and origin/fork PR’s without merge
> > > > > > >>               (main Jenkinsfile)
> > > > > > >>       ii.     Nightly job (nightly Jenkins file, will start with a
> > > > > > >>               dummy one and add more configurations later)
> > > > > > >>    d.      If Windows slave setup is available, provide it to Apache
> > > > > > >>            and enable the jobs again
> > > > > > >> 3.      Transfer the repo and point the build set up there
> > > > > > >> 4.      Apache deploys the docs to their website
> > > > > > >>
> > > > > > >> Open security questions:
> > > > > > >> 1.      How can we ensure that our slaves are not used by other
> > > > > > >>         projects?
> > > > > > >>    a.      It’s not, it’s a social contract.
> > > > > > >> 2.      To protect the slave hosts, would running Jenkins slave
> > > > > > >>         inside a Docker container be a solution, or is there a
> > > > > > >>         recommended best practice?
> > > > > > >>    a.      Run slave behind a NAT gateway and launch via JNLP
> > > > > > >> 3.      Does Apache place SSH key inside the build host for Docs
> > > > > > >>         deployment to the website? Are there security concerns there?
> > > > > > >>    a.      The only slaves that are allowed to deploy docs are
> > > > > > >>            ASF-controlled. Just provide the build command.
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
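
FAQ item 8 in the quoted June 27 message suggests updating remotes with
`git remote set-url origin new_url`. As a minimal sketch of that step, the
Python below rewrites an HTTP(S) GitHub URL for dmlc/mxnet to a new owner/repo
path and builds the matching git command as an argument list. The new path
`apache/incubator-mxnet` and the helper names are assumptions for
illustration; the thread itself does not state the final repository location.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_PATH = "dmlc/mxnet"
# Assumption: the post-transfer owner/repo; not stated anywhere in the thread.
NEW_PATH = "apache/incubator-mxnet"


def migrated_remote_url(url, old_path=OLD_PATH, new_path=NEW_PATH):
    """Return url with the GitHub owner/repo swapped, or url unchanged.

    Only http(s)://github.com/<owner>/<repo>[.git] URLs are rewritten;
    anything else (other hosts, other repos, SSH-style URLs) is returned as-is.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return url
    if parts.netloc.lower() != "github.com":
        return url
    path = parts.path.strip("/")
    suffix = ""
    if path.endswith(".git"):
        path, suffix = path[:-4], ".git"
    if path.lower() != old_path:
        return url
    new_full_path = "/" + new_path + suffix
    return urlunsplit(
        (parts.scheme, parts.netloc, new_full_path, parts.query, parts.fragment)
    )


def set_url_command(current_url, remote="origin"):
    """Build the argument list for `git remote set-url <remote> <new_url>`."""
    return ["git", "remote", "set-url", remote, migrated_remote_url(current_url)]


if __name__ == "__main__":
    # Print the command rather than running it, so the result can be reviewed.
    print(" ".join(set_url_command("https://github.com/dmlc/mxnet.git")))
```

The sketch only prints the command; passing the list to `subprocess.run`
inside a clone would apply it. Because GitHub redirects the old URL after a
transfer, as the FAQ notes, this step is optional cleanup rather than a
requirement.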



-- 


Dominic Divakaruni
206.475.9200 Cell
