mxnet-dev mailing list archives

From kellen sunderland <kellen.sunderl...@gmail.com>
Subject Re: Time out for Travis CI
Date Tue, 02 Oct 2018 02:33:45 GMT
Still worth following up with Travis (I've already messaged them).  They're
in the middle of reorganizing their business model and merging paid and
free accounts into the same service, so maybe this policy is changing.  It
doesn't make a lot of sense to me that public repo accounts would have
timeout limits that are different to private repo accounts in cases where
they are both paid.

On Tue, Oct 2, 2018, 4:27 AM Marco de Abreu
<marco.g.abreu@googlemail.com.invalid> wrote:

> Apache has its own shared Travis fleet. We are basically using an
> on-premise version of the paid Travis plan. That was the information I got
> from Infra when I had a chat with them a few days ago. But from that
> conversation it was made pretty clear that we cannot increase the limits.
>
> -Marco
>
> On Tue, Oct 2, 2018, 03:25 kellen sunderland <kellen.sunderland@gmail.com>
> wrote:
>
> > Interesting, this page seems to indicate that private projects do have a
> > longer time out.  I'll drop Travis a quick email and see what the deal
> > would be for our project.
> > https://docs.travis-ci.com/user/customizing-the-build/#build-timeouts.
> >
> > On Tue, Oct 2, 2018, 3:15 AM kellen sunderland <kellen.sunderland@gmail.com>
> > wrote:
> >
> > > I actually thought we were already using a paid plan through Apache
> > > https://blogs.apache.org/infra/entry/apache_gains_additional_travis_ci
> > >
> > > On Tue, Oct 2, 2018, 3:11 AM Qing Lan <lanking520@live.com> wrote:
> > >
> > >> Are we currently on a free plan? If we are, the unlimited build
> > >> minutes would probably help.
> > >>
> > >> Thanks,
> > >> Qing
> > >>
> > >> On 10/1/18, 6:08 PM, "kellen sunderland" <kellen.sunderland@gmail.com>
> > >> wrote:
> > >>
> > >>     Does the global time out change for paid plans?  I looked into it
> > >>     briefly but didn't see anything that would indicate it does.
> > >>
> > >>     On Tue, Oct 2, 2018, 2:25 AM Pedro Larroy <pedro.larroy.lists@gmail.com>
> > >>     wrote:
> > >>
> > >>     > I think there are two approaches we can take to mitigate the build &
> > >>     > test time problem: on the one hand, use a paid Travis CI plan; on the
> > >>     > other, group the unit tests into suites and only run a core set of
> > >>     > tests, as we should do on devices, although in that case we reduce
> > >>     > coverage.
> > >>     >
> > >>     > https://travis-ci.com/plans
> > >>     >
> > >>     > Pedro.
> > >>     >
> > >>     > On Sat, Sep 29, 2018 at 6:53 PM YiZhi Liu <eazhi.liu@gmail.com> wrote:
> > >>     >
> > >>     > > This makes sense. Thanks
> > >>     > >
> > >>     > > On Sat, Sep 29, 2018 at 6:36 PM kellen sunderland <
> > >>     > > kellen.sunderland@gmail.com> wrote:
> > >>     > >
> > >>     > > > Hey Zhennan, yes this is the exact problem, and I agree with your
> > >>     > > > points completely.  This is why when we first added Travis we
> > >>     > > > attempted to communicate that it would be informational only, and
> > >>     > > > that we'd need to iterate on the config before it would be a test
> > >>     > > > that people should consider 'required'.  Apologies, we should have
> > >>     > > > been more straightforward about those tradeoffs.  The strong point
> > >>     > > > in favour of adding Travis in informational mode was that we had a
> > >>     > > > serious MacOS-specific bug that we wanted to verify was fixed.
> > >>     > > >
> > >>     > > > The good news is I've opened a PR which I hope will speed up these
> > >>     > > > builds to the point that they won't rely on caching.  Once it is
> > >>     > > > merged it would be very helpful if you could rebase on this PR and
> > >>     > > > test to ensure that large changes no longer hit the global timeout
> > >>     > > > without cache.
> > >>     > > > https://github.com/apache/incubator-mxnet/pull/12706
> > >>     > > >
> > >>     > > > On Sun, Sep 30, 2018 at 2:48 AM Qin, Zhennan <zhennan.qin@intel.com>
> > >>     > > > wrote:
> > >>     > > >
> > >>     > > > > Hi YiZhi and Kellen,
> > >>     > > > >
> > >>     > > > > From my point of view, Travis should be able to pass from a scratch
> > >>     > > > > build. Making the result depend on a ccache hit or miss is not a good
> > >>     > > > > idea. For this PR, since it changed many header files, lots of files
> > >>     > > > > need to be recompiled, just like a scratch build. I think that's the
> > >>     > > > > reason Travis times out. This should be fixed before enabling Travis,
> > >>     > > > > as it will block any change to those base header files. Again, this
> > >>     > > > > is not a special case for this PR only; you can find the same problem
> > >>     > > > > on other PRs:
> > >>     > > > >
> > >>     > > > > https://travis-ci.org/apache/incubator-mxnet/builds/433172088?utm_source=github_status&utm_medium=notification
> > >>     > > > >
> > >>     > > > > https://travis-ci.org/apache/incubator-mxnet/builds/434404305?utm_source=github_status&utm_medium=notification
> > >>     > > > >
> > >>     > > > >
> > >>     > > > > Thanks,
> > >>     > > > > Zhennan
> > >>     > > > >
> > >>     > > > > -----Original Message-----
> > >>     > > > > From: YiZhi Liu [mailto:eazhi.liu@gmail.com]
> > >>     > > > > Sent: Sunday, September 30, 2018 5:15 AM
> > >>     > > > > To: eazhi.liu@gmail.com
> > >>     > > > > Cc: dev@mxnet.incubator.apache.org
> > >>     > > > > Subject: Re: Time out for Travis CI
> > >>     > > > >
> > >>     > > > > while other PRs are all good.
> > >>     > > > > On Sat, Sep 29, 2018 at 2:13 PM YiZhi Liu <eazhi.liu@gmail.com> wrote:
> > >>     > > > > >
> > >>     > > > > > Honestly I don't know yet. I can help to investigate. Just given the
> > >>     > > > > > evidence that Travis times out every time it gets re-triggered - 2
> > >>     > > > > > times at least. Correct me if I'm wrong @Zhennan
> > >>     > > > > >
> > >>     > > > > > On Sat, Sep 29, 2018 at 1:54 PM kellen sunderland
> > >>     > > > > > <kellen.sunderland@gmail.com> wrote:
> > >>     > > > > > >
> > >>     > > > > > > Reading over the PR I don't see what aspects would cause extra
> > >>     > > > > > > runtime. YiZhi, could you point them out?
> > >>     > > > > > >
> > >>     > > > > > > On Sat, Sep 29, 2018 at 8:46 PM YiZhi Liu <eazhi.liu@gmail.com>
> > >>     > > > > > > wrote:
> > >>     > > > > > >
> > >>     > > > > > > > Kellen, I think this PR introduces extra runtime in CI and thus
> > >>     > > > > > > > causes the timeout, which means that once it is merged, every
> > >>     > > > > > > > later PR will see the same timeout in Travis.
> > >>     > > > > > > >
> > >>     > > > > > > > So shall we modify the changes to decrease the test running
> > >>     > > > > > > > time, or just disable the Travis CI?
> > >>     > > > > > > >
> > >>     > > > > > > >
> > >>     > > > > > > > On Fri, Sep 28, 2018 at 9:17 PM Qin, Zhennan
> > >>     > > > > > > > <zhennan.qin@intel.com> wrote:
> > >>     > > > > > > > >
> > >>     > > > > > > > > Hi Kellen,
> > >>     > > > > > > > >
> > >>     > > > > > > > > Thanks for your explanation. Do you have a time plan to solve
> > >>     > > > > > > > > the timeout issue? Rebasing can't work for my case. Or shall
> > >>     > > > > > > > > we run it silently so that it can't vote X on the overall CI
> > >>     > > > > > > > > result? Most developers are used to ignoring PRs with an 'X'.
> > >>     > > > > > > > >
> > >>     > > > > > > > > Thanks,
> > >>     > > > > > > > > Zhennan
> > >>     > > > > > > > >
> > >>     > > > > > > > > -----Original Message-----
> > >>     > > > > > > > > From: kellen sunderland [mailto:kellen.sunderland@gmail.com]
> > >>     > > > > > > > > Sent: Friday, September 28, 2018 10:38 PM
> > >>     > > > > > > > > To: dev@mxnet.incubator.apache.org
> > >>     > > > > > > > > Subject: Re: Time out for Travis CI
> > >>     > > > > > > > >
> > >>     > > > > > > > > Hey Zhennan, you're safe to ignore Travis failures for now.
> > >>     > > > > > > > > They're just informational.
> > >>     > > > > > > > >
> > >>     > > > > > > > > The reason you sometimes see quick builds and sometimes see
> > >>     > > > > > > > > slow builds is that we're making use of ccache in between
> > >>     > > > > > > > > builds. If your PR is similar to what's in master you should
> > >>     > > > > > > > > build very quickly; if not, it's going to take a while and
> > >>     > > > > > > > > likely time out. If you see timeouts, rebasing may speed
> > >>     > > > > > > > > things up. Unfortunately the timeouts are global and we're
> > >>     > > > > > > > > not able to increase them.  I'm hoping that adding artifact
> > >>     > > > > > > > > caching will speed up future builds to the point that test
> > >>     > > > > > > > > runs and builds can be executed in under the global limit
> > >>     > > > > > > > > (which is ~50 minutes).
> > >>     > > > > > > > >
> > >>     > > > > > > > > -Kellen
> > >>     > > > > > > > >
> > >>     > > > > > > > >
> > >>     > > > > > > > > On Fri, Sep 28, 2018 at 4:05 PM Qin, Zhennan
> > >>     > > > > > > > > <zhennan.qin@intel.com> wrote:
> > >>     > > > > > > > >
> > >>     > > > > > > > > > Hi MXNet devs,
> > >>     > > > > > > > > >
> > >>     > > > > > > > > > I've been struggling with the new Travis CI for a while; it
> > >>     > > > > > > > > > always times out for this PR:
> > >>     > > > > > > > > >
> > >>     > > > > > > > > > https://github.com/apache/incubator-mxnet/pull/12530
> > >>     > > > > > > > > >
> > >>     > > > > > > > > > Most of the time Jenkins CI passes, while Travis can't
> > >>     > > > > > > > > > finish within 50 minutes. This PR shouldn't affect the
> > >>     > > > > > > > > > build time or unit test time much. Also, I saw other PRs
> > >>     > > > > > > > > > with the same problem, e.g.
> > >>     > > > > > > > > >
> > >>     > > > > > > > > > https://travis-ci.org/apache/incubator-mxnet/builds/433172088?utm_source=github_status&utm_medium=notification
> > >>     > > > > > > > > >
> > >>     > > > > > > > > > https://travis-ci.org/apache/incubator-mxnet/builds/434404305?utm_source=github_status&utm_medium=notification
> > >>     > > > > > > > > >
> > >>     > > > > > > > > > According to the timestamps from Travis, all passing PRs
> > >>     > > > > > > > > > have small code changes and can complete `make -j2` within
> > >>     > > > > > > > > > 25s. But in the timeout case, `make -j2` needs about 1600s.
> > >>     > > > > > > > > > Does Travis do an incremental build for each test? Shall we
> > >>     > > > > > > > > > increase the time limit for large PRs? Can we add more
> > >>     > > > > > > > > > timestamps for the build and unit test stages to help
> > >>     > > > > > > > > > understand what's going on there?
> > >>     > > > > > > > > >
> > >>     > > > > > > > > > Thanks in advance,
> > >>     > > > > > > > > > Zhennan
> > >>     > > > > > > > > >
> > >>     > > > > > > >
> > >>     > > > > > > >
> > >>     > > > > > > >
> > >>     > > > > > > > --
> > >>     > > > > > > > Yizhi Liu
> > >>     > > > > > > > DMLC member
> > >>     > > > > > > > Amazon Web Services
> > >>     > > > > > > > Vancouver, Canada
> > >>     > > > > > > >
> > >>     > > > > >
> > >>     > > > > >
> > >>     > > > > >
> > >>     > > > > > --
> > >>     > > > > > Yizhi Liu
> > >>     > > > > > DMLC member
> > >>     > > > > > Amazon Web Services
> > >>     > > > > > Vancouver, Canada
> > >>     > > > >
> > >>     > > > >
> > >>     > > > >
> > >>     > > > > --
> > >>     > > > > Yizhi Liu
> > >>     > > > > DMLC member
> > >>     > > > > Amazon Web Services
> > >>     > > > > Vancouver, Canada
> > >>     > > > >
> > >>     > > >
> > >>     > > --
> > >>     > > Yizhi Liu
> > >>     > > DMLC member
> > >>     > > Amazon Web Services
> > >>     > > Vancouver, Canada
> > >>     > >
> > >>     >
> > >>
> > >>
> > >>
> >
>
