mxnet-dev mailing list archives

From kellen sunderland <kellen.sunderl...@gmail.com>
Subject Re: [DISCUSS] Build OSX builds in CI (possibly with TravisCI).
Date Wed, 05 Sep 2018 10:01:44 GMT
I'm bumping this thread as we've recently had our first serious bug on
MacOS that would have been caught by enabling Travis.

I'm going to do a little experimental work together with Marco with the
goal of enabling a minimal Travis build that will run python tests.  So far
I've verified that Travis will in fact find a bug that currently exists in
master and has been reproduced by MacOS clients.  This indicates to me that
adding Travis will add value to our CI.

My best guess is that it might take us some iteration before we find a
scalable way to integrate Travis.  Given this we're going to enable Travis
in non-blocking mode (i.e. failures are safe to ignore for the time being).

To help mitigate the risk of timeouts, and to remove legacy code, I'm going
to re-create the .travis.yml file from scratch.  I think it'll be much less
confusing if we only have working code related to Travis in our codebase,
so that contributors won't have to experiment to see what is or isn't
working.  We've got some great but slightly out-of-date functionality in
the legacy .travis.yml file.  I hope we can work together to update the
legacy features, ensure they work with the current folder structure, and
make sure they run within Travis's 45-minute global time window.

I'd also like to set expectations that this is strictly a volunteer
effort.  I'd welcome help from the community for support and maintenance.
The model download caching work particularly stands out to me as
something I'd like to re-enable as soon as possible.

-Kellen

On Tue, Jan 9, 2018 at 11:52 AM Marco de Abreu <marco.g.abreu@googlemail.com>
wrote:

> Looks good! +1
>
> On Tue, Jan 9, 2018 at 10:24 AM, kellen sunderland <
> kellen.sunderland@gmail.com> wrote:
>
> > I think most were in favour of at a minimum creating a clang build so I've
> > created a PR
> > https://github.com/apache/incubator-mxnet/pull/9330/commits/84089ea14123ebe4d66cc92e82a2d529cfbd8b19.
> > My hope is this will catch many of the issues blocking OSX builds.  In fact
> > it already caught one issue.  If you guys are in favour I can remove the
> > WIP and ask that it be merged.
> >
> > On Thu, Jan 4, 2018 at 6:29 PM, Chris Olivier <cjolivier01@gmail.com>
> > wrote:
> >
> > > Nope, I have been on vacation.
> > >
> > > On Thu, Jan 4, 2018 at 9:10 AM, kellen sunderland <
> > > kellen.sunderland@gmail.com> wrote:
> > >
> > > > Hope everyone had a good break.  Just wanted to check if there were
> > > > further thoughts on OSX builds.  Chris, did you have time to look into
> > > > virtualizing Mac OS?  Would it make sense for us to put something in
> > > > place in the interim e.g. the clang solution?
> > > >
> > > > On Tue, Dec 12, 2017 at 7:59 PM, de Abreu, Marco <mabreu@amazon.com>
> > > > wrote:
> > > >
> > > > > Thanks for looking into this, Chris! No hurry on that one, we’ll look
> > > > > into it next stage when we add new system- and build-configurations
> > > > > to the CI.
> > > > >
> > > > > On 12.12.17, 19:12, "Chris Olivier" <cjolivier01@gmail.com> wrote:
> > > > >
> > > > >     I am on vacation starting Thursday.
> > > > >
> > > > >     On Tue, Dec 12, 2017 at 9:49 AM kellen sunderland <
> > > > >     kellen.sunderland@gmail.com> wrote:
> > > > >
> > > > >     > Absolutely, let's do an investigation and see if it's possible to
> > > > >     > virtualize.  Would you have time to look into it a bit further?
> > > > >     >
> > > > >     > On Tue, Dec 12, 2017 at 6:47 PM, Chris Olivier <cjolivier01@gmail.com>
> > > > >     > wrote:
> > > > >     >
> > > > >     > > Don’t get me wrong, I’m not saying this Mac OS Jenkins solution is
> > > > >     > > doable but I feel like we should investigate because the payoff
> > > > >     > > would be large.
> > > > >     > >
> > > > >     > >
> > > > >     > > On Tue, Dec 12, 2017 at 9:38 AM Chris Olivier <cjolivier01@gmail.com>
> > > > >     > > wrote:
> > > > >     > >
> > > > >     > > > Apple’s Darwin OS was recently open-sourced.
> > > > >     > > > https://github.com/PureDarwin/PureDarwin
> > > > >     > > >
> > > > >     > > > How to convert this into a non-GUI VM I am not sure but I am
> > > > >     > > > willing to bet that people have done it already.
> > > > >     > > >
> > > > >     > > > On Tue, Dec 12, 2017 at 9:16 AM kellen sunderland <
> > > > >     > > > kellen.sunderland@gmail.com> wrote:
> > > > >     > > >
> > > > >     > > >> It might be technically possible, but I think it would violate the
> > > > >     > > >> MacOS license: http://store.apple.com/Catalog/US/Images/MacOSX.htm
> > > > >     > > >>
> > > > >     > > >> "2. Permitted License Uses and Restrictions.
> > > > >     > > >> A. This License allows you to install and use one copy of the Apple
> > > > >     > > >> Software on a single Apple-labeled computer at a time. This License
> > > > >     > > >> does not allow the Apple Software to exist on more than one computer
> > > > >     > > >> at a time, and you may not make the Apple Software available over a
> > > > >     > > >> network where it could be used by multiple computers at the same
> > > > >     > > >> time. You may make one copy of the Apple Software (excluding the
> > > > >     > > >> Boot ROM code) in machine-readable form for backup purposes only;
> > > > >     > > >> provided that the backup copy must include all copyright or other
> > > > >     > > >> proprietary notices contained on the original."
> > > > >     > > >>
> > > > >     > > >> I could be wrong though, does anyone know the details of MacOS
> > > > >     > > >> licensing / virtualization?
> > > > >     > > >>
> > > > >     > > >> On Tue, Dec 12, 2017 at 6:10 PM, Chris Olivier <cjolivier01@gmail.com>
> > > > >     > > >> wrote:
> > > > >     > > >>
> > > > >     > > >> > Google seems to be full of people running OSX (and even
> > > > >     > > >> > open-sourced PureDarwin) in VMs. One could conceivably run a VM on
> > > > >     > > >> > an EC2 instance, right?
> > > > >     > > >> >
> > > > >     > > >> > On Tue, Dec 12, 2017 at 9:01 AM kellen sunderland <
> > > > >     > > >> > kellen.sunderland@gmail.com> wrote:
> > > > >     > > >> >
> > > > >     > > >> > > It would be ideal if we could cover OSX in Jenkins, but the only
> > > > >     > > >> > > solution that I'm aware of would require physical machines to be
> > > > >     > > >> > > the workers.  I would be weakly opposed to having physical servers
> > > > >     > > >> > > running on PRs.  The downsides that I see in order of importance:
> > > > >     > > >> > >
> > > > >     > > >> > > -  We can't autoscale physical hardware.  If we find that the load
> > > > >     > > >> > > is too high we have to buy more machines.
> > > > >     > > >> > > -  Security would be tricky, as they'd have to be connected to the
> > > > >     > > >> > > internet and then to our Jenkins master instance.  Connecting via
> > > > >     > > >> > > a wired network would probably not be possible on most corporate
> > > > >     > > >> > > networks as these machines are by definition running arbitrary
> > > > >     > > >> > > code from the internet.  Many corporate sites have public wifi
> > > > >     > > >> > > that this machine could potentially connect to, but then our PRs
> > > > >     > > >> > > start failing if the wifi disconnects temporarily.  To connect to
> > > > >     > > >> > > the master we would need to set up a VPN solution with endpoints
> > > > >     > > >> > > in our VPC on AWS.  This is possible but would probably require a
> > > > >     > > >> > > lot of security work.
> > > > >     > > >> > > -  We can't just create a simple startup script or yaml file that
> > > > >     > > >> > > is checked into GitHub to manage the machine.  Someone will
> > > > >     > > >> > > actually have to physically administer the machine, apply updates,
> > > > >     > > >> > > etc. which will make community ownership difficult.
> > > > >     > > >> > >
> > > > >     > > >> > > Specific to an OSX build:
> > > > >     > > >> > > -  We can't virtualize OSX which means we'd only be able to cover
> > > > >     > > >> > > one OSX build environment per physical device.  We couldn't target
> > > > >     > > >> > > a matrix of OSX and Xcode versions as in Travis.
> > > > >     > > >> > >
> > > > >     > > >> > > -Kellen
> > > > >     > > >> > >
> > > > >     > > >> > > On Tue, Dec 12, 2017 at 5:46 PM, Chris Olivier <cjolivier01@gmail.com>
> > > > >     > > >> > > wrote:
> > > > >     > > >> > >
> > > > >     > > >> > > > So why Travis when we could possibly use Jenkins?
> > > > >     > > >> > > >
> > > > >     > > >> > > > On Tue, Dec 12, 2017 at 7:59 AM Marco de Abreu
> > > > >     > > >> > > > <marco.g.abreu@googlemail.com> wrote:
> > > > >     > > >> > > >
> > > > >     > > >> > > > > Yes that's correct, Chris.
> > > > >     > > >> > > > >
> > > > >     > > >> > > > > On 12.12.2017 at 4:46 PM, "Chris Olivier" <cjolivier01@gmail.com>
> > > > >     > > >> > > > > wrote:
> > > > >     > > >> > > > >
> > > > >     > > >> > > > > > A quick google search seems to indicate that Mac can be used
> > > > >     > > >> > > > > > as a Jenkins slave. Is this correct?
> > > > >     > > >> > > > > >
> > > > >     > > >> > > > > > On Tue, Dec 12, 2017 at 7:42 AM Steffen Rochel
> > > > >     > > >> > > > > > <steffenrochel@gmail.com> wrote:
> > > > >     > > >> > > > > >
> > > > >     > > >> > > > > > > +1 for #1 and #2
> > > > >     > > >> > > > > > >
> > > > >     > > >> > > > > > > I’m working on getting a MacPro to add to the CI system.
> > > > >     > > >> > > > > > > On Tue, Dec 12, 2017 at 1:43 AM kellen sunderland <
> > > > >     > > >> > > > > > > kellen.sunderland@gmail.com> wrote:
> > > > >     > > >> > > > > > >
> > > > >     > > >> > > > > > > > Background: TravisCI is a startup providing managed continuous
> > > > >     > > >> > > > > > > > integration services with GitHub integration and YAML based
> > > > >     > > >> > > > > > > > configuration.  TravisCI is one of the few CI providers that
> > > > >     > > >> > > > > > > > will build a variety of OSX/MacOS builds for software
> > > > >     > > >> > > > > > > > projects.  Their pricing ranges from Free (for open source,
> > > > >     > > >> > > > > > > > 1 concurrent job) to $489 monthly for 10 concurrent jobs.
> > > > >     > > >> > > > > > > >
> > > > >     > > >> > > > > > > > Problem: We’ve had a few OSX build issues slip into MXNet
> > > > >     > > >> > > > > > > > master in the past few weeks.  We’ve previously had a Travis
> > > > >     > > >> > > > > > > > CI based testing system that would have caught these issues.
> > > > >     > > >> > > > > > > >
> > > > >     > > >> > > > > > > > Proposals so far:
> > > > >     > > >> > > > > > > >
> > > > >     > > >> > > > > > > > 1) Use TravisCI in its free mode for a very minimal sanity
> > > > >     > > >> > > > > > > > check on OSX.  If we compile the program, and for example run
> > > > >     > > >> > > > > > > > C++ unit tests, we’re unlikely to run into problems with
> > > > >     > > >> > > > > > > > queued builds.  The total build time here should be less than
> > > > >     > > >> > > > > > > > 15 minutes.  Configuration should be quite simple and easy to
> > > > >     > > >> > > > > > > > maintain.  Error messages should also be obvious to
> > > > >     > > >> > > > > > > > contributors.
> > > > >     > > >> > > > > > > > 2) Run clang in Linux with our current CI.  Building with
> > > > >     > > >> > > > > > > > clang should take less than 10 minutes, should flush out a
> > > > >     > > >> > > > > > > > large subset of the issues we’ve seen with OSX, and be quite
> > > > >     > > >> > > > > > > > easy to maintain.
> > > > >     > > >> > > > > > > > 3) Run full test-suites in TravisCI, equaling the level of
> > > > >     > > >> > > > > > > > coverage we provide to Linux in Jenkins.  This could require
> > > > >     > > >> > > > > > > > us to subscribe to a monthly package with Travis to ensure our
> > > > >     > > >> > > > > > > > build queue doesn’t grow to an unacceptable length.  It may
> > > > >     > > >> > > > > > > > also require a volunteer to set up and maintain long-term.
> > > > >     > > >> > > > > > > >
> > > > >     > > >> > > > > > > > I’d +1 #1 and #2 as I think those should be low-cost,
> > > > >     > > >> > > > > > > > low-maintenance solutions that should catch the majority of
> > > > >     > > >> > > > > > > > the problems we’ve seen thus far.
> > > > >     > > >> > > > > > > >
> > > > >     > > >> > > > > > > > -Kellen
> > > > >     > > >> > > > > > > >
> > > > >     > > >> > > > > > >
> > > > >     > > >> > > > > >
> > > > >     > > >> > > > >
> > > > >     > > >> > > >
> > > > >     > > >> > >
> > > > >     > > >> >
> > > > >     > > >>
> > > > >     > > >
> > > > >     > >
> > > > >     >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
