mxnet-dev mailing list archives

From YiZhi Liu <eazhi....@gmail.com>
Subject Re: Time out for Travis CI
Date Sun, 30 Sep 2018 01:53:33 GMT
This makes sense. Thanks

On Sat, Sep 29, 2018 at 6:36 PM kellen sunderland <kellen.sunderland@gmail.com> wrote:

> Hey Zhennan, yes this is the exact problem, and I agree with your points
> completely.  This is why when we first added Travis we attempted to
> communicate that it would be informational only, and that we'd need to
> iterate on the config before it would be a test that people should consider
> 'required'.  Apologies, we should have been more straightforward about
> those tradeoffs.  The strong point in favour of adding Travis in
> informational mode was that we had a serious MacOS specific bug that we
> wanted to verify was fixed.
>
> The good news is I've opened a PR which I hope will speed up these builds
> to the point that they won't rely on caching.  Once it is merged it would
> be very helpful if you could rebase on this PR and test to ensure that
> large changes no longer hit the global timeout without cache.
> https://github.com/apache/incubator-mxnet/pull/12706
>
> On Sun, Sep 30, 2018 at 2:48 AM Qin, Zhennan <zhennan.qin@intel.com>
> wrote:
>
> > Hi YiZhi and Kellen,
> >
> > From my point of view, Travis should be able to pass from a scratch
> > build. Making the result depend on ccache hits or misses is not a good
> > idea. This PR changed many header files, so lots of files need to be
> > recompiled, just like a scratch build. I think that's the reason Travis
> > times out. This should be fixed before enabling Travis, as it will block
> > any change to those base header files. Again, this is not specific to
> > this PR; you can find the same problem on other PRs:
> >
> > https://travis-ci.org/apache/incubator-mxnet/builds/433172088?utm_source=github_status&utm_medium=notification
> >
> > https://travis-ci.org/apache/incubator-mxnet/builds/434404305?utm_source=github_status&utm_medium=notification
> >
> > Thanks,
> > Zhennan
> >
> > -----Original Message-----
> > From: YiZhi Liu [mailto:eazhi.liu@gmail.com]
> > Sent: Sunday, September 30, 2018 5:15 AM
> > To: eazhi.liu@gmail.com
> > Cc: dev@mxnet.incubator.apache.org
> > Subject: Re: Time out for Travis CI
> >
> > while other PRs are all good.
> > On Sat, Sep 29, 2018 at 2:13 PM YiZhi Liu <eazhi.liu@gmail.com> wrote:
> > >
> > > Honestly I don't know yet. I can help to investigate. I'm just going
> > > by the evidence that Travis times out every time it gets re-triggered,
> > > at least 2 times. Correct me if I'm wrong @Zhennan
> > >
> > > On Sat, Sep 29, 2018 at 1:54 PM kellen sunderland <kellen.sunderland@gmail.com> wrote:
> > > >
> > > > Reading over the PR I don't see what aspects would cause extra
> > > > runtime. YiZhi, could you point them out?
> > > >
> > > > On Sat, Sep 29, 2018 at 8:46 PM YiZhi Liu <eazhi.liu@gmail.com> wrote:
> > > >
> > > > > Kellen, I think this PR introduces extra runtime in CI, which
> > > > > causes the timeout. That means, once it is merged, every later PR
> > > > > will see the same timeout in Travis.
> > > > >
> > > > > So shall we modify the changes to decrease the test running time,
> > > > > or just disable the Travis CI?
> > > > >
> > > > >
> > > > > On Fri, Sep 28, 2018 at 9:17 PM Qin, Zhennan
> > > > > <zhennan.qin@intel.com>
> > > > > wrote:
> > > > > >
> > > > > > Hi Kellen,
> > > > > >
> > > > > > Thanks for your explanation. Do you have a timeline for solving
> > > > > > the timeout issue? Rebasing doesn't work in my case. Or shall we
> > > > > > run it silently so that it can't vote X on the overall CI result?
> > > > > > Most developers tend to ignore PRs marked with an 'X'.
> > > > > >
> > > > > > Thanks,
> > > > > > Zhennan
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: kellen sunderland [mailto:kellen.sunderland@gmail.com]
> > > > > > Sent: Friday, September 28, 2018 10:38 PM
> > > > > > To: dev@mxnet.incubator.apache.org
> > > > > > Subject: Re: Time out for Travis CI
> > > > > >
> > > > > > Hey Zhennan, you're safe to ignore Travis failures for now.
> > > > > > They're just informational.
> > > > > >
> > > > > > The reason you sometimes see quick builds and sometimes see slow
> > > > > > builds is that we're making use of ccache in between builds.  If
> > > > > > your PR is similar to what's in master you should build very
> > > > > > quickly, if not it's going to take a while and likely time out.
> > > > > > If you see timeouts rebasing may speed things up.  Unfortunately
> > > > > > the timeouts are global and we're not able to increase them.  I'm
> > > > > > hoping that adding artifact caching will speed up future builds to
> > > > > > the point that test runs and builds can be executed in under the
> > > > > > global limit (which is ~50 minutes).
> > > > > >
> > > > > > -Kellen
> > > > > >
> > > > > >
> > > > > > On Fri, Sep 28, 2018 at 4:05 PM Qin, Zhennan
> > > > > > <zhennan.qin@intel.com>
> > > > > wrote:
> > > > > >
> > > > > > > Hi MXNet devs,
> > > > > > >
> > > > > > > I've been struggling with the new Travis CI for a while; it
> > > > > > > always times out for this PR:
> > > > > > > https://github.com/apache/incubator-mxnet/pull/12530
> > > > > > >
> > > > > > > Most of the time Jenkins CI passes, while Travis can't finish
> > > > > > > within 50 minutes. This PR shouldn't affect the build time or
> > > > > > > unit test time much. Also, I saw other PRs with the same
> > > > > > > problem, e.g.
> > > > > > >
> > > > > > > https://travis-ci.org/apache/incubator-mxnet/builds/433172088?utm_source=github_status&utm_medium=notification
> > > > > > >
> > > > > > > https://travis-ci.org/apache/incubator-mxnet/builds/434404305?utm_source=github_status&utm_medium=notification
> > > > > > >
> > > > > > > According to the timestamps from Travis, all passing PRs
> > > > > > > contain small code changes and complete `make -j2` within
> > > > > > > 25s. But in the timeout cases, `make -j2` needs about 1600s.
> > > > > > > Does Travis do an incremental build for each test? Shall we
> > > > > > > increase the time limit for large PRs? Can we add more
> > > > > > > timestamps for the build and unit test stages to help
> > > > > > > understand what's going on there?
> > > > > > >
> > > > > > > Thanks in advance,
> > > > > > > Zhennan
> > > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Yizhi Liu
> > > > > DMLC member
> > > > > Amazon Web Services
> > > > > Vancouver, Canada
> > > > >
>
-- 
Yizhi Liu
DMLC member
Amazon Web Services
Vancouver, Canada
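
The request earlier in the thread for more timestamps around the build and unit test stages could be approached roughly as follows. This is only a minimal Python sketch, not part of the MXNet CI configuration: `timed_stage`, `remaining_budget`, and `GLOBAL_LIMIT_S` are hypothetical names, and the 50-minute figure is the approximate global limit quoted in the thread.

```python
import time
from contextlib import contextmanager

# Assumption: ~50 minutes, the approximate global Travis limit quoted in the thread.
GLOBAL_LIMIT_S = 50 * 60


@contextmanager
def timed_stage(name, timings, clock=time.monotonic):
    """Record the wall-clock seconds spent in a named CI stage (hypothetical helper)."""
    start = clock()
    try:
        yield
    finally:
        # Store the duration even if the stage raised, so failures are timed too.
        timings[name] = clock() - start


def remaining_budget(timings, limit_s=GLOBAL_LIMIT_S):
    """Seconds left under the global limit after the stages recorded so far."""
    return limit_s - sum(timings.values())


if __name__ == "__main__":
    timings = {}
    with timed_stage("build", timings):
        pass  # placeholder for the real build step
    with timed_stage("unittest", timings):
        pass  # placeholder for the real test step
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds:.1f}s")
    print(f"remaining: {remaining_budget(timings):.1f}s")
```

In a CI script, the body of each `with` block would be the real stage, for example a `subprocess.run(["make", "-j2"], check=True)` call. Logging `remaining_budget` right after the build would show early whether the test stage still fits within the limit.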
