mxnet-dev mailing list archives

From kellen sunderland <kellen.sunderl...@gmail.com>
Subject Re: [DISCUSS] Build OSX builds in CI (possibly with TravisCI).
Date Wed, 05 Sep 2018 16:49:15 GMT
Great that you feel that way, Lin. Please feel free to contribute if you have
any features you'd like tested.  We are using the Travis image xcode9.4, which
is based on MacOS 10.13.

On Wed, Sep 5, 2018 at 6:40 PM Lin Yuan <apeforest@gmail.com> wrote:

> Hi Kellen,
>
> Many thanks for your and Marco's effort! I think this is a very crucial
> piece to improve MXNet stability.
>
> To add some data points:
> 1) Customers using the CoreML to MXNet converter were blocked for a while
> because the converter was broken and no unit test was in place to detect
> that.
> 2) Developers on Mac could not verify their local commits because some unit
> tests on master were broken. Detecting the failure wasted a lot of time and
> resources on the Jenkins server.
> 3) Please consider running the CI on Mac OS 10.13 since this is the minimum
> Mac OS version that supports CoreML (to support the CoreML to MXNet converter).
>
> Best Regards,
>
> Lin
>
> On Wed, Sep 5, 2018, 3:02 AM kellen sunderland <
> kellen.sunderland@gmail.com>
> wrote:
>
> > I'm bumping this thread as we've recently had our first serious bug on
> > MacOS that would have been caught by enabling Travis.
> >
> > I'm going to do a little experimental work together with Marco with the
> > goal of enabling a minimal Travis build that will run Python tests.  So far
> > I've verified that Travis will in fact find a bug that currently exists in
> > master and has been reproduced by MacOS clients.  This indicates to me that
> > adding Travis will add value to our CI.
> >
> > My best guess is that it might take us some iteration before we find a
> > scalable way to integrate Travis.  Given this we're going to enable Travis
> > in non-blocking mode (i.e. failures are safe to ignore for the time being).
> >
> > To help mitigate the risk of timeouts, and to remove legacy code, I'm going
> > to re-create the .travis.yml file from scratch.  I think it'll be much less
> > confusing if we only have working code related to Travis in our codebase,
> > so that contributors won't have to experiment to see what is or isn't
> > working.  We've got some great, but slightly out-of-date functionality in
> > the legacy .travis.yml file.  I hope we can work together to update the
> > legacy features, ensure they work with the current folder structure and
> > also make sure the features run within Travis's 45 minute global time
> > window.
> >
> > I'd also like to set expectations that this is strictly a volunteer
> > effort.  I'd welcome help from the community for support and maintenance.
> > The model download caching work particularly stands out to me as
> > something I'd like to re-enable as soon as possible.
> >
> > -Kellen
> >
> > On Tue, Jan 9, 2018 at 11:52 AM Marco de Abreu <
> > marco.g.abreu@googlemail.com>
> > wrote:
> >
> > > Looks good! +1
> > >
> > > On Tue, Jan 9, 2018 at 10:24 AM, kellen sunderland <
> > > kellen.sunderland@gmail.com> wrote:
> > >
> > > > I think most were in favour of at a minimum creating a clang build so I've
> > > > created a PR:
> > > > https://github.com/apache/incubator-mxnet/pull/9330/commits/84089ea14123ebe4d66cc92e82a2d529cfbd8b19
> > > > My hope is this will catch many of the issues blocking OSX builds.  In fact
> > > > it already caught one issue.  If you guys are in favour I can remove the
> > > > WIP and ask that it be merged.
> > > >
> > > > On Thu, Jan 4, 2018 at 6:29 PM, Chris Olivier <cjolivier01@gmail.com>
> > > > wrote:
> > > >
> > > > > Nope, I have been on vacation.
> > > > >
> > > > > On Thu, Jan 4, 2018 at 9:10 AM, kellen sunderland <
> > > > > kellen.sunderland@gmail.com> wrote:
> > > > >
> > > > > > Hope everyone had a good break.  Just wanted to check if there were
> > > > > > further thoughts on OSX builds.  Chris, did you have time to look into
> > > > > > virtualizing Mac OS?  Would it make sense for us to put something in
> > > > > > place in the interim, e.g. the clang solution?
> > > > > >
> > > > > > On Tue, Dec 12, 2017 at 7:59 PM, de Abreu, Marco <mabreu@amazon.com> wrote:
> > > > > >
> > > > > > > Thanks for looking into this, Chris! No hurries on that one, we’ll look
> > > > > > > into it next stage when we add new system- and build-configurations to
> > > > > > > the CI.
> > > > > > >
> > > > > > > On 12.12.17, 19:12, "Chris Olivier" <cjolivier01@gmail.com> wrote:
> > > > > > >
> > > > > > >     I am on vacation starting Thursday.
> > > > > > >
> > > > > > >     On Tue, Dec 12, 2017 at 9:49 AM kellen sunderland <
> > > > > > >     kellen.sunderland@gmail.com> wrote:
> > > > > > >
> > > > > > >     > Absolutely, let's do an investigation and see if it's possible to
> > > > > > >     > virtualize.  Would you have time to look into it a bit further?
> > > > > > >     >
> > > > > > >     > On Tue, Dec 12, 2017 at 6:47 PM, Chris Olivier <cjolivier01@gmail.com> wrote:
> > > > > > >     >
> > > > > > >     > > Don’t get me wrong, I’m not saying this Mac OS Jenkins solution is doable,
> > > > > > >     > > but I feel like we should investigate because the payoff would be large.
> > > > > > >     > >
> > > > > > >     > > On Tue, Dec 12, 2017 at 9:38 AM Chris Olivier <cjolivier01@gmail.com> wrote:
> > > > > > >     > >
> > > > > > >     > > > Apple’s Darwin OS was recently open-sourced.
> > > > > > >     > > > https://github.com/PureDarwin/PureDarwin
> > > > > > >     > > >
> > > > > > >     > > > How to convert this into a non-GUI VM I am not sure, but I am willing to
> > > > > > >     > > > bet that people have done it already.
> > > > > > >     > > >
> > > > > > >     > > > On Tue, Dec 12, 2017 at 9:16 AM kellen sunderland <kellen.sunderland@gmail.com> wrote:
> > > > > > >     > > >
> > > > > > >     > > >> It might be technically possible, but I think it would violate the MacOS
> > > > > > >     > > >> license: http://store.apple.com/Catalog/US/Images/MacOSX.htm
> > > > > > >     > > >>
> > > > > > >     > > >> "2. Permitted License Uses and Restrictions.
> > > > > > >     > > >> A. This License allows you to install and use one copy of the Apple
> > > > > > >     > > >> Software on a single Apple-labeled computer at a time. This License does
> > > > > > >     > > >> not allow the Apple Software to exist on more than one computer at a
> > > > > > >     > > >> time, and you may not make the Apple Software available over a network
> > > > > > >     > > >> where it could be used by multiple computers at the same time. You may
> > > > > > >     > > >> make one copy of the Apple Software (excluding the Boot ROM code) in
> > > > > > >     > > >> machine-readable form for backup purposes only; provided that the backup
> > > > > > >     > > >> copy must include all copyright or other proprietary notices contained
> > > > > > >     > > >> on the original. "
> > > > > > >     > > >>
> > > > > > >     > > >> I could be wrong though, does anyone know the details of MacOS
> > > > > > >     > > >> licensing / virtualization?
> > > > > > >     > > >>
> > > > > > >     > > >> On Tue, Dec 12, 2017 at 6:10 PM, Chris Olivier <cjolivier01@gmail.com> wrote:
> > > > > > >     > > >>
> > > > > > >     > > >> > Googling turns up plenty of examples of running OSX (and even the
> > > > > > >     > > >> > open-sourced PureDarwin) in VMs. One could conceivably run a VM on an
> > > > > > >     > > >> > EC2 instance, right?
> > > > > > >     > > >> >
> > > > > > >     > > >> > On Tue, Dec 12, 2017 at 9:01 AM kellen sunderland <kellen.sunderland@gmail.com> wrote:
> > > > > > >     > > >> >
> > > > > > >     > > >> > > It would be ideal if we could cover OSX in Jenkins, but the only
> > > > > > >     > > >> > > solution that I'm aware of would require physical machines to be the
> > > > > > >     > > >> > > workers.  I would be weakly opposed to having physical servers running
> > > > > > >     > > >> > > on PRs.  The downsides that I see in order of importance:
> > > > > > >     > > >> > >
> > > > > > >     > > >> > > -  We can't autoscale physical hardware.  If we find that the load is
> > > > > > >     > > >> > > too high we have to buy more machines.
> > > > > > >     > > >> > > -  Security would be tricky, as they'd have to be connected to the
> > > > > > >     > > >> > > internet and then to our Jenkins master instance.  Connecting via a
> > > > > > >     > > >> > > wired network would probably not be possible on most corporate
> > > > > > >     > > >> > > networks as these machines are by definition running arbitrary code
> > > > > > >     > > >> > > from the internet.  Many corporate sites have public wifi that this
> > > > > > >     > > >> > > machine could potentially connect to, but then our PRs start failing
> > > > > > >     > > >> > > if the wifi disconnects temporarily.  To connect to the master we
> > > > > > >     > > >> > > would need to set up a VPN solution with endpoints in our VPC on AWS.
> > > > > > >     > > >> > > This is possible but would probably require a lot of security work.
> > > > > > >     > > >> > > -  We can't just create a simple startup script or yaml file that is
> > > > > > >     > > >> > > checked into GitHub to manage the machine.  Someone will actually have
> > > > > > >     > > >> > > to physically administer the machine, apply updates, etc. which will
> > > > > > >     > > >> > > make community ownership difficult.
> > > > > > >     > > >> > >
> > > > > > >     > > >> > > Specific to an OSX build:
> > > > > > >     > > >> > > -  We can't virtualize OSX which means we'd only be able to cover one
> > > > > > >     > > >> > > OSX build environment per physical device.  We couldn't target a
> > > > > > >     > > >> > > matrix of OSX and Xcode versions as in Travis.
> > > > > > >     > > >> > >
> > > > > > >     > > >> > > -Kellen
> > > > > > >     > > >> > >
> > > > > > >     > > >> > > On Tue, Dec 12, 2017 at 5:46 PM, Chris Olivier <cjolivier01@gmail.com> wrote:
> > > > > > >     > > >> > >
> > > > > > >     > > >> > > > So why Travis when we could possibly use Jenkins?
> > > > > > >     > > >> > > >
> > > > > > >     > > >> > > > On Tue, Dec 12, 2017 at 7:59 AM Marco de Abreu <marco.g.abreu@googlemail.com> wrote:
> > > > > > >     > > >> > > >
> > > > > > >     > > >> > > > > Yes that's correct, Chris.
> > > > > > >     > > >> > > > >
> > > > > > >     > > >> > > > > On 12.12.2017 at 4:46 PM, "Chris Olivier" <cjolivier01@gmail.com> wrote:
> > > > > > >     > > >> > > > >
> > > > > > >     > > >> > > > > > A quick Google search seems to indicate that a Mac can be used as a
> > > > > > >     > > >> > > > > > Jenkins slave.  Is this correct?
> > > > > > >     > > >> > > > > >
> > > > > > >     > > >> > > > > > On Tue, Dec 12, 2017 at 7:42 AM Steffen Rochel <steffenrochel@gmail.com> wrote:
> > > > > > >     > > >> > > > > >
> > > > > > >     > > >> > > > > > > +1 for #1 and #2
> > > > > > >     > > >> > > > > > >
> > > > > > >     > > >> > > > > > > I’m working on getting a MacPro to add to the CI system.
> > > > > > >     > > >> > > > > > >
> > > > > > >     > > >> > > > > > > On Tue, Dec 12, 2017 at 1:43 AM kellen sunderland <kellen.sunderland@gmail.com> wrote:
> > > > > > >     > > >> > > > > > >
> > > > > > >     > > >> > > > > > > > Background:  TravisCI is a startup providing managed continuous
> > > > > > >     > > >> > > > > > > > integration services with GitHub integration and YAML based
> > > > > > >     > > >> > > > > > > > configuration.  TravisCI is one of the few CI providers that will
> > > > > > >     > > >> > > > > > > > build a variety of OSX/MacOS builds for software projects.  Their
> > > > > > >     > > >> > > > > > > > pricing ranges from free (for open source, 1 concurrent job) to
> > > > > > >     > > >> > > > > > > > $489 monthly for 10 concurrent jobs.
> > > > > > >     > > >> > > > > > > >
> > > > > > >     > > >> > > > > > > > Problem: We’ve had a few OSX build issues slip into MXNet master in
> > > > > > >     > > >> > > > > > > > the past few weeks.  We’ve previously had a Travis CI based testing
> > > > > > >     > > >> > > > > > > > system that would have caught these issues.
> > > > > > >     > > >> > > > > > > >
> > > > > > >     > > >> > > > > > > > Proposals so far:
> > > > > > >     > > >> > > > > > > >
> > > > > > >     > > >> > > > > > > > 1) Use TravisCI in its free mode for a very minimal sanity check on
> > > > > > >     > > >> > > > > > > > OSX.  If we compile the program, and for example run C++ unit tests,
> > > > > > >     > > >> > > > > > > > we’re unlikely to run into problems with queued builds.  The total
> > > > > > >     > > >> > > > > > > > build time here should be less than 15 minutes.  Configuration
> > > > > > >     > > >> > > > > > > > should be quite simple and easy to maintain.  Error messages should
> > > > > > >     > > >> > > > > > > > also be obvious to contributors.
> > > > > > >     > > >> > > > > > > > 2) Run clang in Linux with our current CI.  Building with clang
> > > > > > >     > > >> > > > > > > > should take less than 10 minutes, should flush out a large subset
> > > > > > >     > > >> > > > > > > > of the issues we’ve seen with OSX, and be quite easy to maintain.
> > > > > > >     > > >> > > > > > > > 3) Run full test-suites in TravisCI, equaling the level of coverage
> > > > > > >     > > >> > > > > > > > we provide to Linux in Jenkins.  This could require us to subscribe
> > > > > > >     > > >> > > > > > > > to a monthly package with Travis to ensure our build queue doesn’t
> > > > > > >     > > >> > > > > > > > grow to an unacceptable length.  It may also require a volunteer to
> > > > > > >     > > >> > > > > > > > set it up and maintain it long-term.
> > > > > > >     > > >> > > > > > > >
> > > > > > >     > > >> > > > > > > > I’d +1 #1 and #2 as I think those should be low-cost,
> > > > > > >     > > >> > > > > > > > low-maintenance solutions that should catch the majority of the
> > > > > > >     > > >> > > > > > > > problems we’ve seen thus far.
> > > > > > >     > > >> > > > > > > >
> > > > > > >     > > >> > > > > > > > -Kellen
