airflow-dev mailing list archives

From Dan Davydov <dan.davy...@airbnb.com.INVALID>
Subject Re: [RESULT] [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc4
Date Thu, 23 Feb 2017 20:25:20 GMT
Here is the DAG: http://imgur.com/a/zXXsS

On Thu, Feb 23, 2017 at 12:18 PM, Arthur Wiedmer <arthur.wiedmer@gmail.com>
wrote:

> Dan,
>
> Inline images get stripped by the mailing server. You will have to upload
> to imgur or something.
>
> Best
> Arthur
>
> On Feb 23, 2017 12:13 PM, "Dan Davydov" <dan.davydov@airbnb.com.invalid>
> wrote:
>
> > Here is an example for 1, you can see that there are some white tasks that
> > should have been run. I don't have time to create a skeleton DAG at the
> > moment unfortunately because of release-related firefighting. Will
> > hopefully post back here later once firefighting is done.
> > [image: Inline image 1]
> >
> > On Thu, Feb 23, 2017 at 12:00 PM, Bolke de Bruin <bdbruin@gmail.com>
> > wrote:
> >
> >> Hey Dan, Alex,
> >>
> >> Indeed #1 seems serious, specifically the second part - skipping the
> >> root task (root task of the whole DAG?). Do you have a skeleton DAG that
> >> exposes the issue? Is there a root cause analysis? When was the issue
> >> introduced? On the issue Alex mentioned, we don’t see that and I cannot
> >> really align the description of the issue with the PR yet, i.e. I need
> >> clarification.
> >>
> >> Obviously, I’m not very happy if we indeed need to retract the release as
> >> we are ~12 hours away from closing of the vote at the IPMC mailinglist
> >> (strangely enough no one has voted yet). However, if it is that serious
> >> that it cannot wait for 1.8.1 then we need to do it. I would define
> >> “serious” as many people are going to be affected by it and they will not
> >> have a workaround available to them (ie. patching code or database), but
> >> the opinion of the community might differ.
> >>
> >> Cheers
> >> Bolke
> >>
> >> P.S. I am also interested in #3, as it sounds like an integrity issue
> >> (which verify_integrity should catch) but also maybe too strong an
> >> assumption that such a task should exist (i.e. a task was added to a DAG
> >> at a later stage).
> >>
> >>
> >> > On 23 Feb 2017, at 20:15, Dan Davydov <dan.davydov@airbnb.com.INVALID>
> >> > wrote:
> >> >
> >> > Some more issues found by our users in addition to the one Alex reported
> >> > and the UI issue when a dagrun doesn't have a start date:
> >> > 1. If a task fails, the whole dagrun immediately fails. This is a very
> >> > large change to how control flow works, as the rest of the tasks in the
> >> > DAG are not run (even e.g. leaf tasks). The same is true of the skipped
> >> > status (if a leaf task is skipped then the root task for the DAG will get
> >> > skipped and none of the other tasks in the DAG will run).
> >> > 2. The black squares in the UI for tasks that aren't ready to run yet are
> >> > confusing and make it hard for users to see which tasks haven't run yet
> >> > (lower contrast). We should never initialize tasks in the DB that do not
> >> > have a state (or at the least these should be white).
> >> > 3. The Dagrun has a get_task_instance method that will fail if a dagrun
> >> > doesn't have a copy of a task instance created, which we have seen happen
> >> > for some DAGs. This prevents those tasks from getting scheduled.
> >> >
> >> > I already patched 3 (and have a PR in flight for open source), and am
> >> > working on a patch for 1 internally. 1 should be a blocker for releasing.
> >> >
> >> > On Wed, Feb 22, 2017 at 4:38 PM, Alex Guziel
> >> > <alex.guziel@airbnb.com.invalid> wrote:
> >> >
> >> >> I have some concern that this change
> >> >> https://github.com/apache/incubator-airflow/pull/1939
> >> >> [AIRFLOW-679] may be having issues because we are seeing lots of double
> >> >> triggers of tasks and tasks being killed as a result.
> >> >>
> >> >> On Wed, Feb 22, 2017 4:35 PM, Dan Davydov dan.davydov@airbnb.com.INVALID
> >> >> wrote:
> >> >> Bumping the thread so another user can comment.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Wed, Feb 22, 2017 at 3:12 PM, Maxime Beauchemin
> >> >> <maximebeauchemin@gmail.com> wrote:
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>> What I meant to ask is "how much engineering effort it takes to bake a
> >> >>> single RC?", I guess it depends on how much git-fu is necessary plus some
> >> >>> overhead cost of doing the series of actions/commands/emails/jira.
> >> >>>
> >> >>> I can volunteer for 1.8.1 (hopefully I can get another Airbnb
> >> >>> engineer/volunteer to tag along) and will try to document/automate
> >> >>> everything I can as I go through the process. The goal of 1.8.1 could be
> >> >>> to basically package 1.8.0 + Dan's bugfix, and for Airbnb to get familiar
> >> >>> with the process.
> >> >>>
> >> >>> It'd be great if you can dump your whole process on the wiki, and we'll
> >> >>> improve it on this next pass.
> >> >>>
> >> >>> Thanks again for the mountain of work that went into packaging this
> >> >>> release.
> >> >>>
> >> >>> Max
> >> >>>
> >> >>
> >> >>> On Wed, Feb 22, 2017 at 2:44 PM, Bolke de Bruin <bdbruin@gmail.com>
> >> >> wrote:
> >> >>
> >> >>>
> >> >>
> >> >>>> I thought you volunteered to babysit 1.8.1 Chris ;-)?
> >> >>
> >> >>>>
> >> >>
> >> >>>> Sent from my iPhone
> >> >>
> >> >>>>
> >> >>
> >> >>>>> On 22 Feb 2017, at 23:31, Chris Riccomini <criccomini@apache.org>
> >> >>
> >> >>> wrote:
> >> >>
> >> >>>>>
> >> >>
> >> >>>>> I'm +1 for doing a 1.8.1 fast follow-on
> >> >>
> >> >>>>>
> >> >>
> >> >>>>> On Wed, Feb 22, 2017 at 2:26 PM, Maxime Beauchemin
> >> >>>>> <maximebeauchemin@gmail.com> wrote:
> >> >>
> >> >>>>>
> >> >>
> >> >>>>>> Our database may have edge cases that could be associated with running
> >> >>>>>> any previous version that may or may not have been part of an official
> >> >>>>>> release.
> >> >>>>>>
> >> >>>>>> Let's see if anyone else reports the issue. If no one does, one option is
> >> >>>>>> to release 1.8.0 as is with a comment in the release notes, and have a
> >> >>>>>> future official minor Apache release 1.8.1 that would fix these minor
> >> >>>>>> issues that are not deal breakers.
> >> >>>>>>
> >> >>>>>> @bolke, I'm curious, how long does it take you to go through one release
> >> >>>>>> cycle? Oh, and do you have a documented step-by-step process for
> >> >>>>>> releasing? I'd like to add the PyPI part to this doc and add committers
> >> >>>>>> that are interested to have rights on the project on PyPI.
> >> >>>>>>
> >> >>>>>> Max
> >> >>
> >> >>>>>>
> >> >>
> >> >>>>>>> On Wed, Feb 22, 2017 at 2:00 PM, Bolke de Bruin <bdbruin@gmail.com>
> >> >>>>>>> wrote:
> >> >>>>>>>
> >> >>>>>>> So it is a database integrity issue? Afaik a start_date should always be
> >> >>>>>>> set for a DagRun (create_dagrun does so). I didn't check the code though.
> >> >>
> >> >>>>>>>
> >> >>
> >> >>>>>>> Sent from my iPhone
> >> >>
> >> >>>>>>>
> >> >>
> >> >>>>>>>> On 22 Feb 2017, at 22:19, Dan Davydov <dan.davydov@airbnb.com.INVALID>
> >> >>>>>>>> wrote:
> >> >>>>>>>>
> >> >>>>>>>> Should clarify this occurs when a dagrun does not have a start date, not
> >> >>>>>>>> a dag (which makes it even less likely to happen). I don't think this is
> >> >>>>>>>> a blocker for releasing.
> >> >>
> >> >>>>>>>>
> >> >>
> >> >>>>>>>>> On Wed, Feb 22, 2017 at 1:15 PM, Dan Davydov <dan.davydov@airbnb.com>
> >> >>>>>>>>> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>> I rolled this out in our prod and the webservers failed to load due to
> >> >>>>>>>>> this commit:
> >> >>>>>>>>>
> >> >>>>>>>>> [AIRFLOW-510] Filter Paused Dags, show Last Run & Trigger Dag
> >> >>>>>>>>> 7c94d81c390881643f94d5e3d7d6fb351a445b72
> >> >>>>>>>>>
> >> >>>>>>>>> This fixed it:
> >> >>>>>>>>> - </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true" title="Start Date: {{last_run.start_date.strftime('%Y-%m-%d %H:%M')}}"></span>
> >> >>>>>>>>> + </a> <span id="statuses_info" class="glyphicon glyphicon-info-sign" aria-hidden="true"></span>
> >> >>>>>>>>>
> >> >>>>>>>>> This is caused by assuming that all DAGs have start dates set, so a
> >> >>>>>>>>> broken DAG will take down the whole UI. Not sure if we want to make this
> >> >>>>>>>>> a blocker for the release or not, I'm guessing for most deployments this
> >> >>>>>>>>> would occur pretty rarely. I'll submit a PR to fix it soon.
> >> >>
> >> >>>>>>>>>
> >> >>
> >> >>>>>>>>>
> >> >>
> >> >>>>>>>>>
> >> >>
> >> >>>>>>>>> On Tue, Feb 21, 2017 at 9:49 AM, Chris Riccomini <criccomini@apache.org>
> >> >>>>>>>>> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> Ack that the vote has already passed, but belated +1 (binding)
> >> >>
> >> >>>>>>>>>>
> >> >>
> >> >>>>>>>>>> On Tue, Feb 21, 2017 at 7:42 AM, Bolke de Bruin <bdbruin@gmail.com>
> >> >>>>>>>>>> wrote:
> >> >>
> >> >>>>>>>>>>
> >> >>
> >> >>>>>>>>>>> IPMC Voting can be found here:
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-general/201702.mbox/%3c676BDC9F-1B55-4469-92A7-9FF309AD0EC8@gmail.com%3e
> >> >>
> >> >>>>>>>>>>>
> >> >>
> >> >>>>>>>>>>> Kind regards,
> >> >>
> >> >>>>>>>>>>> Bolke
> >> >>
> >> >>>>>>>>>>>
> >> >>
> >> >>>>>>>>>>>> On 21 Feb 2017, at 08:20, Bolke de Bruin <bdbruin@gmail.com> wrote:
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Hello,
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Apache Airflow (incubating) 1.8.0 (based on RC4) has been accepted.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> 9 “+1” votes received:
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> - Maxime Beauchemin (binding)
> >> >>>>>>>>>>>> - Arthur Wiedmer (binding)
> >> >>>>>>>>>>>> - Dan Davydov (binding)
> >> >>>>>>>>>>>> - Jeremiah Lowin (binding)
> >> >>>>>>>>>>>> - Siddharth Anand (binding)
> >> >>>>>>>>>>>> - Alex van Boxel (binding)
> >> >>>>>>>>>>>> - Bolke de Bruin (binding)
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> - Jayesh Senjaliya (non-binding)
> >> >>>>>>>>>>>> - Yi (non-binding)
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Vote thread (start):
> >> >>>>>>>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3cD360D9BE-C358-42A1-9188-6C92C31A2F8B@gmail.com%3e
> >> >>>>>>>>>>>> <http://mail-archives.apache.org/mod_mbox/incubator-airflow-dev/201702.mbox/%3C7EB7B6D6-092E-48D2-AA0F-15F44376A8FF@gmail.com%3E>
> >> >>
> >> >>>>>>>>>>>>
> >> >>
> >> >>>>>>>>>>>> Next steps:
> >> >>>>>>>>>>>> 1) I will start the voting process at the IPMC mailinglist. I do
> >> >>>>>>>>>>>> expect some changes to be required, mostly in documentation, maybe a
> >> >>>>>>>>>>>> license here and there. So, we might end up with changes to stable.
> >> >>>>>>>>>>>> As long as these are not (significant) code changes I will not
> >> >>>>>>>>>>>> re-raise the vote.
> >> >>>>>>>>>>>> 2) Only after the positive voting on the IPMC and finalisation I
> >> >>>>>>>>>>>> will rebrand the RC to Release.
> >> >>>>>>>>>>>> 3) I will upload it to the incubator release page, then the tar ball
> >> >>>>>>>>>>>> needs to propagate to the mirrors.
> >> >>>>>>>>>>>> 4) Update the website (can someone volunteer please?)
> >> >>>>>>>>>>>> 5) Finally, I will ask Maxime to upload it to pypi. It seems we can
> >> >>>>>>>>>>>> keep the apache branding as lib cloud is doing this as well
> >> >>>>>>>>>>>> (https://libcloud.apache.org/downloads.html#pypi-package).
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Jippie!
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Bolke
> >> >>
> >> >>>>>>>>>>>
> >> >>
> >> >>>>>>>>>>>
> >> >>
> >> >>>>>>>>>>
> >> >>
> >> >>>>>>>>>
> >> >>
> >> >>>>>>>>>
> >> >>
> >> >>>>>>>
> >> >>
> >> >>>>>>
> >> >>
> >> >>>>
> >> >>
> >> >>>
> >> >>
> >>
> >>
> >
>
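
Editor's sketch: the template fix quoted earlier in this thread removes the "Start Date" tooltip entirely, because calling strftime on a missing start_date raised an error and took down the page. An alternative is to guard the formatting so a run without a start date degrades to a placeholder. The helper below is a minimal, hypothetical illustration of that idea in plain Python. The function name and the placeholder text are assumptions for illustration only; this is not Airflow's code and not the patch that was actually submitted.

```python
from datetime import datetime
from typing import Optional


def format_start_date(start_date: Optional[datetime], placeholder: str = "n/a") -> str:
    """Format a run's start date for display, tolerating a missing value.

    Hypothetical helper: returns `placeholder` instead of raising when
    `start_date` is None, which is the case described in the thread.
    """
    if start_date is None:
        return placeholder
    # Same format string as the template line quoted in the thread.
    return start_date.strftime("%Y-%m-%d %H:%M")


if __name__ == "__main__":
    print(format_start_date(datetime(2017, 2, 22, 13, 15)))
    print(format_start_date(None))
```

Whether a placeholder is preferable to dropping the tooltip depends on how the view is used; the point is only that the None case is handled before strftime is called.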
