airflow-dev mailing list archives

From Jarek Potiuk <Jarek.Pot...@polidea.com>
Subject Re: Travis CI random failures
Date Tue, 23 Jul 2019 21:29:43 GMT
Yep. We are back. We still have a slow queue, but at least the builds are
not failing randomly. Please rebase your builds on top of the current
master and push again to trigger builds.

On Tue, Jul 23, 2019 at 9:31 PM Jarek Potiuk <Jarek.Potiuk@polidea.com>
wrote:

> Looks like they fixed it: https://github.com/travis-ci/worker/issues/604
>
> On Tue, Jul 23, 2019 at 8:38 PM Driesprong, Fokko <fokko@driesprong.frl>
> wrote:
>
>> I see issues in other Apache projects as well (Druid and Avro). They're
>> running out of memory. Let's see how Travis responds.
>>
>> Cheers, Fokko
>>
>> On Tue, Jul 23, 2019 at 19:43, Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> wrote:
>>
>> > FYI. Still not fixed. Others experience this as well:
>> > https://github.com/travis-ci/worker/issues/604
>> >
>> > On Tue, Jul 23, 2019 at 11:34 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> > wrote:
>> >
>> > > No good news yet. We are still being randomly assigned 1 CPU / 3.5GB
>> > > memory instances. Infrastructure is on it.
>> > >
>> > > On Tue, Jul 23, 2019 at 10:49 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> > > wrote:
>> > >
>> > >> It looks like we are back to the original specs. I am running tests and
>> > >> will re-enable everything if I see it works.
>> > >>
>> > >> J.
>> > >>
>> > >> On Tue, Jul 23, 2019 at 10:34 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> > >> wrote:
>> > >>
>> > >>> From INFRA: "I have confirmed that our builds appear to be running with
>> > >>> 3.75GB memory and 1 core currently. This does not match Travis' standard
>> > >>> specs (7.5GB and 2 cores), and I have raised a ticket with their support.
>> > >>> I will respond when we hear back from Travis."
>> > >>>
>> > >>>
>> > >>> On Tue, Jul 23, 2019 at 10:26 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> > >>> wrote:
>> > >>>
>> > >>>> It is now confirmed that the problem is on the Travis CI side:
>> > >>>>
>> > >>>> I re-ran the commit from before the new CI was introduced (I
>> > >>>> cherry-picked a small doc fix related to the recent Sphinx dependency
>> > >>>> update) and it fails in exactly the same way (memory and CPU problems):
>> > >>>> https://travis-ci.org/apache/airflow/builds/562450592
>> > >>>>
>> > >>>> For now I cannot do much but wait for INFRA's response (and work on
>> > >>>> the GitLab CI replacement for Travis).
>> > >>>>
>> > >>>> I recommend bringing some popcorn. It's going to be an interesting one
>> > >>>> to watch.
>> > >>>>
>> > >>>> J.
>> > >>>>
>> > >>>> On Tue, Jul 23, 2019 at 9:43 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> > >>>> wrote:
>> > >>>>
>> > >>>>> It is now pretty consistent: it happens almost every time with the
>> > >>>>> old build system, for example here:
>> > >>>>> https://travis-ci.org/apache/airflow/builds/562435992
>> > >>>>>
>> > >>>>> I will cancel all PR builds and disable automated PR builds on Travis
>> > >>>>> until we solve the problem. Running them is pointless, as new PRs will
>> > >>>>> simply queue and fail constantly.
>> > >>>>>
>> > >>>>> I opened a critical infrastructure ticket:
>> > >>>>> https://issues.apache.org/jira/browse/INFRA-18787
>> > >>>>> I am also running some additional tests: I am running the builds from
>> > >>>>> the commit before the new CI to see whether another change since then
>> > >>>>> could have caused it.
>> > >>>>>
>> > >>>>> J.
>> > >>>>>
>> > >>>>>
>> > >>>>> On Tue, Jul 23, 2019 at 8:55 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> > >>>>> wrote:
>> > >>>>>
>> > >>>>>> Update 2: I can confirm that the same memory/resource-related issues
>> > >>>>>> happen in my Travis CI forks with the changes reverted :(
>> > >>>>>> https://travis-ci.org/potiuk/airflow/builds/562430507
>> > >>>>>> I will escalate it to Travis/Apache infrastructure.
>> > >>>>>>
>> > >>>>>> On Tue, Jul 23, 2019 at 8:35 AM Jarek Potiuk <
>> > >>>>>> Jarek.Potiuk@polidea.com> wrote:
>> > >>>>>>
>> > >>>>>>> Update: it looks like it's Travis's problem: I reverted the CI
>> > >>>>>>> changes and we have the same CPU problem in the old build:
>> > >>>>>>> https://travis-ci.org/potiuk/airflow/jobs/562430517
>> > >>>>>>>
>> > >>>>>>> On Tue, Jul 23, 2019 at 8:32 AM Jarek Potiuk <
>> > >>>>>>> Jarek.Potiuk@polidea.com> wrote:
>> > >>>>>>>
>> > >>>>>>>> Hello everyone,
>> > >>>>>>>>
>> > >>>>>>>> We've started to experience some random failures on Travis related
>> > >>>>>>>> to a lack of resources: those are either Out of Memory errors or a
>> > >>>>>>>> lack of CPUs to run Kubernetes builds.
>> > >>>>>>>>
>> > >>>>>>>> I tried to rerun those, thinking it was an intermittent error. It
>> > >>>>>>>> started happening yesterday and I have not seen it before, so I
>> > >>>>>>>> rather doubt it is related to the latest changes.
>> > >>>>>>>>
>> > >>>>>>>> But I do not want to risk everyone being blocked, so I am now
>> > >>>>>>>> testing on my own fork whether reverting the latest CI changes
>> > >>>>>>>> helps. I will let you know, and I will revert if I find that the
>> > >>>>>>>> old CI works in a stable way.
>> > >>>>>>>>
>> > >>>>>>>> In the meantime, I will cancel all outstanding builds that are
>> > >>>>>>>> blocking our queue and will test both the old CI and the new CI in
>> > >>>>>>>> our fork :( (the Travis queue limit is not helping).
>> > >>>>>>>>
>> > >>>>>>>> Can you please hold off on rebasing/pushing new PRs until I have
>> > >>>>>>>> checked it?
>> > >>>>>>>>
>> > >>>>>>>> Example failures:
>> > >>>>>>>>
>> > >>>>>>>>    - OSError: [Errno 12] Cannot allocate memory
>> > >>>>>>>>    (https://travis-ci.org/apache/airflow/jobs/562395978)
>> > >>>>>>>>    - [ERROR NumCPU]: the number of available CPUs 1 is less than
>> > >>>>>>>>    the required 2
>> > >>>>>>>>    (https://travis-ci.org/apache/airflow/jobs/562395978)
>> > >>>>>>>>
>> > >>>>>>>>
>> > >>>>>>>> J.
>> > >>>>>>>>
>> > >>>>>>>> --
>> > >>>>>>>>
>> > >>>>>>>> Jarek Potiuk
>> > >>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>> > >>>>>>>>
>> > >>>>>>>> M: +48 660 796 129 <+48660796129>
>> > >>>>>>>> [image: Polidea] <https://www.polidea.com/>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
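
The two failure modes reported in this thread (a worker with 1 core instead
of 2, and roughly half the expected memory) can be caught at the very start
of a build, before the tests run into them. What follows is only a minimal
sketch, not something taken from the Airflow CI scripts. The function names
and the thresholds are assumptions made for illustration: the 2-core figure
comes from the specs quoted from INFRA above, and the 7.0 memory floor is an
arbitrary value chosen a little below the nominal 7.5GB, since the total
reported by the operating system may be lower than the nominal size. The
memory lookup relies on POSIX sysconf and is assumed to run on a Linux worker.

```python
import os

# Assumed thresholds (not from the Airflow CI configuration). The core count
# is the Travis "standard spec" quoted in this thread; the memory floor is an
# arbitrary value slightly below the nominal 7.5GB.
REQUIRED_CPUS = 2
REQUIRED_MEM_GB = 7.0


def total_memory_gb():
    """Return total physical memory in GiB using POSIX sysconf (Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024 ** 3


def check_resources(cpus, mem_gb,
                    required_cpus=REQUIRED_CPUS,
                    required_mem_gb=REQUIRED_MEM_GB):
    """Return a list of problem descriptions; empty if the worker is fine."""
    problems = []
    if cpus < required_cpus:
        problems.append(
            "available CPUs %d is less than the required %d"
            % (cpus, required_cpus))
    if mem_gb < required_mem_gb:
        problems.append(
            "available memory %.2fGB is less than the required %.2fGB"
            % (mem_gb, required_mem_gb))
    return problems


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to 1 in that case.
    found = check_resources(os.cpu_count() or 1, total_memory_gb())
    for problem in found:
        print("[resource check] " + problem)
    if not found:
        print("[resource check] worker matches the expected specs")
```

Run as the first step of a job, a check like this makes an undersized worker
visible in the first lines of the log instead of surfacing later as an
out-of-memory error in the middle of the test run.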
