airflow-dev mailing list archives

From Jarek Potiuk <Jarek.Pot...@polidea.com>
Subject Re: Travis CI random failures
Date Tue, 23 Jul 2019 07:43:18 GMT
The failure is now consistent and happens on almost every run, even with the
old build system - for example here:
https://travis-ci.org/apache/airflow/builds/562435992.

I will cancel all PR builds and disable automated PR builds on Travis until we
solve the problem - running them is pointless, as new PRs will simply queue
and fail constantly.

I opened a critical infrastructure ticket:
https://issues.apache.org/jira/browse/INFRA-18787 and I am running some
additional tests - I am running builds from the commit before the new CI to
see whether another change since then could be causing it.

J.


On Tue, Jul 23, 2019 at 8:55 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
wrote:

> Update2: I can confirm that the same memory/resource-related issues happen
> in my Travis CI forks with reverted changes :(
> https://travis-ci.org/potiuk/airflow/builds/562430507 . I will escalate
> it to Travis/Apache infrastructure.
>
> On Tue, Jul 23, 2019 at 8:35 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
> wrote:
>
>> Update: it looks like it's Travis's problem: I reverted the CI changes
>> and we have the same CPU problem in the old build:
>> https://travis-ci.org/potiuk/airflow/jobs/562430517 .
>>
>> On Tue, Jul 23, 2019 at 8:32 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> wrote:
>>
>>> Hello everyone,
>>>
>>> We've started to experience some random failures on Travis related to
>>> lack of resources: those are either Out of Memory errors or a lack of CPUs to
>>> run Kubernetes builds.
>>>
>>> I tried to rerun those, thinking it was an intermittent error. It
>>> started happening yesterday and I have not seen it before so I rather doubt
>>> it is related to the latest changes.
>>>
>>> But I do not want to risk everyone being blocked, so I am now testing on
>>> my own fork whether reverting the latest CI changes helps. I will let you know
>>> and will revert if I find that the old CI works in a stable way.
>>>
>>> In the meantime - I will cancel all outstanding builds that are
>>> blocking our queue and will test both the old CI and the new CI in our fork :(
>>> (the Travis queue limit is not helping).
>>>
>>> Can you please hold off on rebasing/pushing new PRs until I check it?
>>>
>>> Example failures:
>>>
>>>
>>>    - OSError: [Errno 12] Cannot allocate memory (
>>>    https://travis-ci.org/apache/airflow/jobs/562395978)
>>>    - [ERROR NumCPU]: the number of available CPUs 1 is less than the
>>>    required 2 (https://travis-ci.org/apache/airflow/jobs/562395978)
>>>
>>>
>>> J.
>>>
>>> --
>>>
>>> Jarek Potiuk
>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>
>>> M: +48 660 796 129 <+48660796129>
>>> [image: Polidea] <https://www.polidea.com/>
>>>
>>>

