airflow-dev mailing list archives

From Dan Davydov <dan.davy...@airbnb.com.INVALID>
Subject Re: 1.7.1 release status
Date Thu, 05 May 2016 18:09:11 GMT
Moved discussion to https://issues.apache.org/jira/browse/AIRFLOW-52 and
updated the status of the task there.

On Tue, May 3, 2016 at 2:32 AM, Dan Davydov <dan.davydov@airbnb.com> wrote:

> It's per DAG unfortunately (we have some pretty funky DAGs here).
> On May 2, 2016 10:26 PM, "Bolke de Bruin" <bdbruin@gmail.com> wrote:
>
>> Hi Dan,
>>
>> Is that per dag or per dag bag? Multiprocessing should parallelize dag
>> parsing so I am very curious. Let me know if I can help out.
>> Bolke
>>
>> Sent from my iPhone
>>
>> > On 3 May 2016, at 01:47, Dan Davydov <dan.davydov@airbnb.com.INVALID>
>> > wrote:
>> >
>> > So a quick update: unfortunately we saw some DAGBag parsing time
>> > increases (~10x for some DAGs) on the webservers with 1.7.1rc3.
>> > Because of this I will be working on a staging cluster that has a copy
>> > of our production DAGBag and is a copy of our production Airflow
>> > infrastructure, just without the workers. This will let us debug the
>> > release outside of production.
>> >
>> > On Thu, Apr 28, 2016 at 10:20 AM, Dan Davydov <dan.davydov@airbnb.com>
>> > wrote:
>> >
>> >> Definitely, here were the issues we hit:
>> >> - airbnb/airflow#1365 occurred
>> >> - Webservers/scheduler were timing out and stuck in restart cycles due
>> >> to increased time spent on parsing DAGs due to
>> >> airbnb/airflow#1213/files
>> >> - Failed tasks that ran after the upgrade and the revert (after we
>> >> reverted the upgrade) were unable to be cleared (but running the tasks
>> >> through the UI worked without clearing them)
>> >> - The way log files were stored on S3 was changed (Airflow now
>> >> requires a connection to be set up), which broke log storage
>> >> - Some DAGs were broken (unable to be parsed) due to package
>> >> reorganization in open-source (the import paths were changed) (the
>> >> utils refactor commit)
>> >>
>> >> On Thu, Apr 28, 2016 at 12:17 AM, Bolke de Bruin <bdbruin@gmail.com>
>> >> wrote:
>> >>
>> >>> Dan,
>> >>>
>> >>> Are you able to share some of the bugs you have been hitting and
>> >>> connected commits?
>> >>>
>> >>> We could at the very least learn from them and maybe even improve
>> >>> testing.
>> >>>
>> >>> Bolke
>> >>>
>> >>>
>> >>>> On 28 Apr 2016, at 06:51, Dan Davydov
>> >>>> <dan.davydov@airbnb.com.INVALID> wrote:
>> >>>>
>> >>>> All of the blockers were fixed as of yesterday (there was some issue
>> >>>> that Jeremiah was looking at with the last release candidate which I
>> >>>> think is fixed, but I'm not sure). I started staging the
>> >>>> airbnb_1.7.1rc3 tag earlier today, so as long as metrics look OK and
>> >>>> the 1.7.1rc2 issues seem resolved tomorrow, I will release internally
>> >>>> either tomorrow or Monday (we try to avoid releases on Friday). If
>> >>>> there aren't any issues we can push the 1.7.1 tag on Monday/Tuesday.
>> >>>>
>> >>>> @Sid
>> >>>> I think we were originally aiming to deploy internally once every two
>> >>>> weeks but we decided to do it once a month in the end. I'm not too
>> >>>> sure about that so Max can comment there.
>> >>>>
>> >>>> We have been running 1.7.0 in production for about a month now and it
>> >>>> is stable.
>> >>>>
>> >>>> I think what really slowed down this release cycle is some commits
>> >>>> that caused severe bugs that we decided to roll forward with instead
>> >>>> of rolling back. We can potentially try reverting these commits next
>> >>>> time while the fixes are applied for the next version, although this
>> >>>> is not always trivial to do.
>> >>>>
>> >>>> On Wed, Apr 27, 2016 at 9:31 PM, Siddharth Anand <
>> >>>> siddharthanand@yahoo.com.invalid> wrote:
>> >>>>
>> >>>>> Btw, are any of the committers running 1.7.0 or later in any staging
>> >>>>> or production env? I have to say, the fact that 1.6.2 was the most
>> >>>>> stable release and is 4 or more months old does not say much for our
>> >>>>> release cadence or process. What's our plan for 1.7.1?
>> >>>>>
>> >>>>> Sent from Sid's iPhone
>> >>>>>
>> >>>>>> On Apr 27, 2016, at 9:05 PM, Chris Riccomini
>> >>>>>> <criccomini@apache.org> wrote:
>> >>>>>>
>> >>>>>> Hey all,
>> >>>>>>
>> >>>>>> I just wanted to check in on the 1.7.1 release status. I know there
>> >>>>>> have been some major-ish bugs, as well as several people doing
>> >>>>>> tests. Should we create a 1.7.1 release JIRA, and track outstanding
>> >>>>>> issues there?
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>> Chris
>> >>
>>
>
