airflow-dev mailing list archives

From Bolke de Bruin <bdbr...@gmail.com>
Subject Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc3
Date Thu, 16 Feb 2017 23:06:30 GMT
PR is done and ready for proper review. Please note:

* The recursive option has been removed. It made no sense in my opinion. SubDags and their tasks
are always considered.
* It depends on existing dag runs. You cannot add a task instance somewhere in the middle
of nowhere.
* It will create dag runs for subdags to ensure consistency.
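
To make the three points above concrete, here is a minimal sketch in plain Python. It is not the code from the PR and it does not use Airflow's API: `DagRun`, `mark_success`, the `runs` and `subdags` dictionaries and the `parent.task` naming of subdags are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

SUCCESS = "success"
RunKey = Tuple[str, str]  # (dag_id, execution_date)


@dataclass
class DagRun:
    """Hypothetical stand-in for a dag run: maps task_id -> state."""
    dag_id: str
    execution_date: str
    task_states: Dict[str, str] = field(default_factory=dict)


def mark_success(
    runs: Dict[RunKey, DagRun],
    subdags: Dict[str, List[str]],
    dag_id: str,
    execution_date: str,
    task_id: str,
    _is_subdag: bool = False,
) -> List[RunKey]:
    """Mark one task successful and return the dag runs that were touched.

    A top-level task is only marked inside a dag run that already exists.
    Subdags (assumed here to be named "<dag_id>.<task_id>") are always
    followed, and a dag run is created for them when it is missing.
    """
    key = (dag_id, execution_date)
    if key not in runs:
        if not _is_subdag:
            # No task instances "in the middle of nowhere".
            raise LookupError(f"no dag run for {dag_id} at {execution_date}")
        # Subdag runs are created on demand to keep things consistent.
        runs[key] = DagRun(dag_id, execution_date)

    runs[key].task_states[task_id] = SUCCESS
    touched = [key]

    # There is no "recursive" switch: subdag tasks are always considered.
    sub_id = f"{dag_id}.{task_id}"
    for sub_task in subdags.get(sub_id, []):
        for sub_key in mark_success(
            runs, subdags, sub_id, execution_date, sub_task, _is_subdag=True
        ):
            if sub_key not in touched:
                touched.append(sub_key)
    return touched


# Example: the parent dag run exists, the subdag run does not yet.
runs = {("etl", "2017-02-16"): DagRun("etl", "2017-02-16")}
subdags = {"etl.load": ["load_a", "load_b"]}
touched = mark_success(runs, subdags, "etl", "2017-02-16", "load")
```

In this sketch, marking `load` also creates a run for the hypothetical `etl.load` subdag and marks its tasks, while asking for a date without a parent dag run raises `LookupError`.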

Cheers
Bolke

> On 13 Feb 2017, at 21:46, Bolke de Bruin <bdbruin@gmail.com> wrote:
> 
> You raise a good point on the “extra_task” thing. I haven't tested that. I am discussing
> matters on gitter with Sid as well (although he is in a meeting now). I am holding off raising
> the vote on the IPMC (*sigh* ;) ).
> 
> Bolke
> 
> 
>> On 13 Feb 2017, at 21:40, Dan Davydov <dan.davydov@airbnb.com.INVALID> wrote:
>> 
>> I feel like there might be enough reliance on these features to merge these
>> in, e.g. mark-successing a non-existent task to prevent it from running.
>> I'm curious what others think. Also isn't mark success still needed for
>> when you add a new task with depends_on_past to an existing dag or is that
>> fixed as well?
>> 
>> On Mon, Feb 13, 2017 at 12:25 PM, Bolke de Bruin <bdbruin@gmail.com> wrote:
>> 
>>> A little bit more background on the issue. Mark success sits in views.py
>>> as “def success”. The code should mark a task “successful”, optionally
>>> together with its upstream and downstream tasks, even for tasks in the
>>> future (up until datetime.now()) and in the past. It was often used to kick
>>> off the first dag run when “depends_on_past” was used. As of 1.8.0 this is
>>> no longer required. The code is complex, lacks testing and, more importantly,
>>> is outdated: it creates tasks on its own without dag runs, and is not
>>> aware of the “NONE” state. In addition it is buggy (upstream and downstream
>>> currently do the same thing, i.e. only downstream). Hence, in my opinion it
>>> requires refactoring, which I am doing at the moment.
>>> 
>>> Two small fixes could be included in the release, but they don’t solve the
>>> root cause.
>>> 
>>> * https://github.com/apache/incubator-airflow/pull/2075
>>> * https://github.com/apache/incubator-airflow/pull/2074
>>> 
>>> I suggest fixing this in 1.8.1 properly. Chris :) volunteered to do 1.8.1
>>> soon after 1.8.0
>>> 
>>> Any thoughts?
>>> 
>>> Bolke
>>> 
>>>> On 13 Feb 2017, at 20:59, Bolke de Bruin <bdbruin@gmail.com> wrote:
>>>> 
>>>> https://github.com/apache/incubator-airflow/pull/2075
>>>> 
>>>> Is (part of) the fix. I can include it retroactively if needed, but I
>>>> don’t consider it blocking.
>>>> 
>>>> Bolke
>>>> 
>>>> 
>>>>> On 13 Feb 2017, at 20:56, Dan Davydov <dan.davydov@airbnb.com.INVALID> wrote:
>>>>> 
>>>>> Can you give more details/a repro case Sid? FWIW mark success and clear
>>>>> both work for me.
>>>>> 
>>>>> On Mon, Feb 13, 2017 at 11:51 AM, siddharth anand <sanand@apache.org> wrote:
>>>>> 
>>>>>> Folks!
>>>>>> I need to change my vote.. -1 (Binding).
>>>>>> 
>>>>>> 
>>>>>> Mark Success/Clear is broken in the UI. It's a regression.
>>>>>> 
>>>>>> -s
>>>>>> 
>>>>>> On Mon, Feb 13, 2017 at 10:53 AM, Alex Van Boxel <alex@vanboxel.be> wrote:
>>>>>> 
>>>>>>> +1 (binding)
>>>>>>> 
>>>>>>> On Mon, Feb 13, 2017 at 7:45 PM siddharth anand <sanand@apache.org> wrote:
>>>>>>> 
>>>>>>>> +1 (binding)
>>>>>>>> 
>>>>>>>> On Mon, Feb 13, 2017 at 8:57 AM, Chris Riccomini <criccomini@apache.org> wrote:
>>>>>>>> 
>>>>>>>>> +1 (binding)
>>>>>>>>> 
>>>>>>>>> On Sun, Feb 12, 2017 at 8:54 AM, Jeremiah Lowin <jlowin@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>>> Interesting -- I also run on Kubernetes with a git-sync sidecar, but the
>>>>>>>>>> containers wait for the synced repo to appear before starting since it
>>>>>>>>>> contains some dependencies -- I assume that's why I didn't experience the
>>>>>>>>>> same issue.
>>>>>>>>>> 
>>>>>>>>>> On Sun, Feb 12, 2017 at 6:29 AM Bolke de Bruin <bdbruin@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> The race condition doesn't explain why “num_runs = None” resolved
>>>>>>>>>>> the issue for you earlier, but it does give a clue now: the PR that
>>>>>>>>>>> introduced “num_runs = -1” was there to be able to work with empty dag
>>>>>>>>>>> dirs; maybe it wasn’t fully covered yet.
>>>>>>>>>>> 
>>>>>>>>>>> Bolke
>>>>>>>>>>> 
>>>>>>>>>>>> On 12 Feb 2017, at 12:26, Bolke de Bruin <bdbruin@gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Ok great! Thanks! That sounds like a race condition: module not
>>>>>>>>>>>> available yet at time of reading. I would expect that it resolves itself
>>>>>>>>>>>> after a while.
>>>>>>>>>>>> 
>>>>>>>>>>>> After talking to some people at the Warsaw BigData conf I have some
>>>>>>>>>>>> ideas around syncing dags. Spoiler: no dependency on git.
>>>>>>>>>>>> 
>>>>>>>>>>>> - Bolke
>>>>>>>>>>>> 
>>>>>>>>>>>>> On 12 Feb 2017, at 11:17, Alex Van Boxel <alex@vanboxel.be> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Running ok, in staging... @bolke I'm running patch-less. I've switched my
>>>>>>>>>>>>> Kubernetes from:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> - each container (webserver/scheduler/worker) had a git-sync'er (getting
>>>>>>>>>>>>> the dags from git)
>>>>>>>>>>>>>> this meant that the scheduler had 0 dags at startup, and should have picked them up later
>>>>>>>>>>>>> 
>>>>>>>>>>>>> to
>>>>>>>>>>>>> 
>>>>>>>>>>>>> - single NFS share that shares airflow_home over each container
>>>>>>>>>>>>>> the git sync'er is now a separate container running before the other containers
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This resolved my mystery DAG crashes.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'll be updating production to a patchless RC3 today, you get my vote
>>>>>>>>>>>>> after that.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Sun, Feb 12, 2017 at 4:59 AM Boris Tyukin <boris@boristyukin.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> awesome! thanks Jeremiah
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Sat, Feb 11, 2017 at 12:53 PM, Jeremiah Lowin <jlowin@apache.org> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Boris, I submitted a PR to address your second point --
>>>>>>>>>>>>>>> https://github.com/apache/incubator-airflow/pull/2068. Thanks!
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Sat, Feb 11, 2017 at 10:42 AM Boris Tyukin <boris@boristyukin.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I am running LocalExecutor and not doing crazy things but use DAG
>>>>>>>>>>>>>>>> generation heavily - everything runs fine as before. As I mentioned in
>>>>>>>>>>>>>>>> other threads only had a few issues:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 1) had to upgrade MySQL which was a PAIN. Cloudera CDH is running old
>>>>>>>>>>>>>>>> version of MySQL which was compatible with 1.7.1 but not compatible now
>>>>>>>>>>>>>>>> with 1.8 because of fractional seconds support PR.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 2) when you install airflow, there are two new example DAGs
>>>>>>>>>>>>>>>> (last_task_only) which are going back very far in the past and scheduled
>>>>>>>>>>>>>>>> to run every hour - a bunch of dags triggered on the first start of
>>>>>>>>>>>>>>>> scheduler and hosed my CPU
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Everything else was fine and I LOVE lots of small UI changes, which
>>>>>>>>>>>>>>>> reduced a lot my use of cli.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks again for the amazing work and an awesome project!
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Sat, Feb 11, 2017 at 9:17 AM, Jeremiah Lowin <jlowin@apache.org> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I was able to deploy successfully. +1 (binding)
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 7:37 PM Maxime Beauchemin <maximebeauchemin@gmail.com> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 3:44 PM, Arthur Wiedmer <arthur.wiedmer@gmail.com> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Feb 10, 2017 3:13 PM, "Dan Davydov" <dan.davydov@airbnb.com.invalid> wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Our staging looks good, all the DAGs there pass.
>>>>>>>>>>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 10:21 AM, Chris Riccomini <criccomini@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Running in all environments. Will vote after the weekend to make sure
>>>>>>>>>>>>>>>>>>>>> things are working properly, but so far so good.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 10, 2017 at 6:05 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Dear All,
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Let’s try again!
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> I have made the THIRD RELEASE CANDIDATE of Airflow 1.8.0 available at:
>>>>>>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/ , public keys
>>>>>>>>>>>>>>>>>>>>>> are available at https://dist.apache.org/repos/dist/release/incubator/airflow/ .
>>>>>>>>>>>>>>>>>>>>>> It is tagged with a local version “apache.incubating” so it allows
>>>>>>>>>>>>>>>>>>>>>> upgrading from earlier releases.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Two issues have been fixed since release candidate 2:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> * trigger_dag could create dags with fractional seconds, not supported by
>>>>>>>>>>>>>>>>>>>>>> logging and UI at the moment
>>>>>>>>>>>>>>>>>>>>>> * local api client trigger_dag had hardcoded execution of None
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Known issue:
>>>>>>>>>>>>>>>>>>>>>> * Airflow on kubernetes and num_runs -1 (default) can expose import
>>>>>>>>>>>>>>>>>>>>>> issues.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> I have extensively discussed this with Alex (reporter) and we consider
>>>>>>>>>>>>>>>>>>>>>> this a known issue with a workaround available as we are unable to
>>>>>>>>>>>>>>>>>>>>>> replicate this in a different environment. UPDATING.md has been updated
>>>>>>>>>>>>>>>>>>>>>> with the work around.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> As these issues are confined to a very specific area and full unit tests
>>>>>>>>>>>>>>>>>>>>>> were added I would also like to raise a VOTE for releasing 1.8.0 based on
>>>>>>>>>>>>>>>>>>>>>> release candidate 3, i.e. just renaming release candidate 3 to 1.8.0
>>>>>>>>>>>>>>>>>>>>>> release.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Please respond to this email by:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> +1,0,-1 with *binding* if you are a PMC member or *non-binding* if you
>>>>>>>>>>>>>>>>>>>>>> are not.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>>>> Bolke
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> My VOTE: +1 (binding)
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> --
>>>>>>>>>>>>> _/
>>>>>>>>>>>>> _/ Alex Van Boxel
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> --
>>>>>>> _/
>>>>>>> _/ Alex Van Boxel
> 

