airflow-dev mailing list archives

From Bolke de Bruin <bdbr...@gmail.com>
Subject Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5
Date Thu, 16 Mar 2017 15:53:42 GMT
I agree that it is not nice. I suggest that we revisit our fix and see if we can do better
there (rather than adding new complexity). And get this into 1.8.1.

Nevertheless, I consider the vote passed.

Bolke

> On 15 Mar 2017, at 19:12, Dan Davydov <dan.davydov@airbnb.com.INVALID> wrote:
> 
> The only thing is that this is a change in semantics and changing semantics
> (breaking some DAGs) and then changing them back (and breaking things
> again) isn't great.
> 
> On Wed, Mar 15, 2017 at 7:02 PM, Bolke de Bruin <bdbruin@gmail.com> wrote:
> 
>> Indeed that could be the case. Let's get 1.8.0 out the door so we can
>> focus on these bug fixes for 1.8.1.
>> 
>> Bolke
>> 
>> Sent from my iPhone
>> 
>>> On 15 Mar 2017, at 18:25, Dan Davydov <dan.davydov@airbnb.com.INVALID>
>> wrote:
>>> 
>>> Another issue we are seeing is
>>> https://issues.apache.org/jira/browse/AIRFLOW-992 - tasks that have both
>>> skipped children and successful children are run instead of skipped. Not
>>> blocking the release on this, just letting you guys know for the release bug
>>> notes. We will be cherrypicking a fix for this onto our production when we
>>> release 1.8 once we come up with one.
>>> 
>>> It's possibly, though not necessarily, related to an incomplete/incorrect
>>> fix of https://issues.apache.org/jira/browse/AIRFLOW-719 .
>>> 
>>>> On Wed, Mar 15, 2017 at 4:53 PM, siddharth anand <sanand@apache.org>
>> wrote:
>>>> 
>>>> Confirmed that Bolke's PR above fixes the issue.
>>>> 
>>>> Also, I agree this is not a blocker for the current airflow release, so my
>>>> +1 (binding) stands.
>>>> -s
>>>> 
>>>>> On Wed, Mar 15, 2017 at 3:11 PM, Bolke de Bruin <bdbruin@gmail.com>
>> wrote:
>>>>> 
>>>>> PR is available: https://github.com/apache/incubator-airflow/pull/2154
>>>>> 
>>>>> But marked for 1.8.1.
>>>>> 
>>>>> - Bolke
>>>>> 
>>>>>> On 15 Mar 2017, at 14:37, Bolke de Bruin <bdbruin@gmail.com> wrote:
>>>>>> 
>>>>>> On second thought I do consider it a bug and can have a fix out pretty
>>>>>> quickly, but I don’t consider it a blocker.
>>>>>> 
>>>>>> - B.
>>>>>> 
>>>>>>> On 15 Mar 2017, at 14:21, Bolke de Bruin <bdbruin@gmail.com> wrote:
>>>>>>> 
>>>>>>> Just to be clear: Also in 1.7.1 the DagRun was marked successful, but
>>>>>>> its tasks continued to be scheduled. So one could also consider 1.7.1
>>>>>>> behaviour a bug. I am not sure here, but I think it kind of makes sense to
>>>>>>> consider the behaviour of 1.7.1 a bug. It has been present throughout all
>>>>>>> the 1.8 rc/beta/alpha series.
>>>>>>> 
>>>>>>> So yes, it is a change in behaviour; whether it is a regression or an
>>>>>>> integrity improvement is up for discussion. Either way I don’t consider it
>>>>>>> a blocker.
>>>>>>> 
>>>>>>> Bolke.
>>>>>>> 
>>>>>>>> On 15 Mar 2017, at 14:06, siddharth anand <sanand@apache.org> wrote:
>>>>>>>> 
>>>>>>>> Here's the JIRA :
>>>>>>>> https://issues.apache.org/jira/browse/AIRFLOW-989
>>>>>>>> 
>>>>>>>> I confirmed it is a regression from 1.7.1.3, which I installed via pip and
>>>>>>>> tested against the same DAG in the JIRA.
>>>>>>>> 
>>>>>>>> The issue occurs if a leaf / last / terminal downstream task is not
>>>>>>>> cleared. You won't see this issue if you clear the entire DAG Run or clear
>>>>>>>> a task and all of its downstream tasks. If you truly want to only clear and
>>>>>>>> rerun a task, but not its downstream tasks, you can use the CLI to execute
>>>>>>>> a specific task (e.g. via airflow run).
>>>>>>>> 
>>>>>>>> This is a change in behavior -- if we do go ahead with the release, then
>>>>>>>> this JIRA should be in a list of JIRAs of known issues related to the new
>>>>>>>> version.
>>>>>>>> -s
>>>>>>>> 
>>>>>>>> On Wed, Mar 15, 2017 at 9:17 AM, Chris Riccomini <
>>>>> criccomini@apache.org>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> @Sid, does this happen if you clear downstream as well?
>>>>>>>>> 
>>>>>>>>> On Wed, Mar 15, 2017 at 9:04 AM, Chris Riccomini <
>>>>> criccomini@apache.org>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Has anyone been able to reproduce Sid's issue?
>>>>>>>>>> 
>>>>>>>>>> On Tue, Mar 14, 2017 at 11:17 PM, Bolke de Bruin
<
>>>> bdbruin@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> That is not an airflow error, but a Kerberos error. Try executing the
>>>>>>>>>>> kinit command on the command line by yourself.
>>>>>>>>>>> 
>>>>>>>>>>> Bolke
>>>>>>>>>>> 
>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>> 
>>>>>>>>>>>> On 14 Mar 2017, at 23:11, Ruslan Dautkhanov
<
>>>> dautkhanov@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> `airflow kerberos` is broken in 1.8-rc5
>>>>>>>>>>>> https://issues.apache.org/jira/browse/AIRFLOW-987
>>>>>>>>>>>> Hopefully fix can be part of the 1.8 release.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Ruslan Dautkhanov
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 6:19 PM, siddharth
anand <
>>>>> sanand@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> FYI,
>>>>>>>>>>>>> I've just hit a major bug in the release candidate related to "clear task"
>>>>>>>>>>>>> behavior.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I've been running airflow in both stage and prod since yesterday on rc5 and
>>>>>>>>>>>>> have reproduced this in both environments. I will file a JIRA for this
>>>>>>>>>>>>> tonight, but wanted to send a note over email as well.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> In my example, I have a 2 task DAG. For a given DAG run that has completed
>>>>>>>>>>>>> successfully, if I
>>>>>>>>>>>>> 1) clear task2 (leaf task in this case), the previously-successful DAG Run
>>>>>>>>>>>>> goes back to Running, requeues, and executes the task successfully. The DAG
>>>>>>>>>>>>> Run then returns from Running to Success.
>>>>>>>>>>>>> 2) clear task1 (root task in this case), the previously-successful DAG Run
>>>>>>>>>>>>> goes back to Running, DOES NOT requeue or execute the task at all. The DAG
>>>>>>>>>>>>> Run then returns from Running to Success though it never ran the task.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 1) is expected and previous behavior. 2) is a regression.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The only workaround is to use the CLI to run the task cleared. Here are
>>>>>>>>>>>>> some images :
>>>>>>>>>>>>> *After Clearing the Tasks*
>>>>>>>>>>>>> https://www.dropbox.com/s/wmuxt0krwx6wurr/Screenshot%202017-03-14%2014.09.34.png?dl=0
>>>>>>>>>>>>> 
>>>>>>>>>>>>> *After DAG Runs return to Success*
>>>>>>>>>>>>> https://www.dropbox.com/s/qop933rzgdzchpd/Screenshot%202017-03-14%2014.09.49.png?dl=0
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This is a major regression because it will force everyone to use the CLI
>>>>>>>>>>>>> for things that they would normally use the UI for.
>>>>>>>>>>>>> -s
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -s
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 1:32 PM,
Daniel Huang <
>>>> dxhuang@gmail.com
>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> +1 (non-binding)!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 11:35 AM,
siddharth anand <
>>>>>>>>> sanand@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:42
AM, Maxime Beauchemin <
>>>>>>>>>>>>>>> maximebeauchemin@gmail.com>
wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 3:59
AM, Alex Van Boxel <
>>>>> alex@vanboxel.be
>>>>>>>>>> 
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Note: we had to revert all our ONE_SUCCESS with ALL_SUCCESS trigger rules
>>>>>>>>>>>>>>>>> where the parent nodes were joining with a SKIP. But I kind of should have
>>>>>>>>>>>>>>>>> known this was coming. Apart from that I had a successful run last night.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Tue, Mar 14, 2017
at 1:37 AM siddharth anand <
>>>>>>>>> sanand@apache.org
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I'm going to deploy this
to staging now. Fab work Bolke!
>>>>>>>>>>>>>>>>> -s
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Mon, Mar 13, 2017
at 2:16 PM, Dan Davydov <
>>>>>>>>>>>>> dan.davydov@airbnb.com
>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>>>> invalid
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I'll test this on staging as soon as I get a chance (the testing is
>>>>>>>>>>>>>>>>>> non-blocking on the rc5). Bolke very much in particular :).
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Mon, Mar 13, 2017
at 10:46 AM, Jeremiah Lowin <
>>>>>>>>>>>>>> jlowin@apache.org>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> +1 (binding) extremely impressed by the work and diligence all contributors
>>>>>>>>>>>>>>>>>>> have put in to getting these blockers fixed, Bolke in particular.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Mon, Mar 13,
2017 at 1:07 AM Arthur Wiedmer <
>>>>>>>>>>>>>> arthur@apache.org>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks again
for steering us through Bolke.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> Arthur
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Sun, Mar
12, 2017 at 9:59 PM, Bolke de Bruin <
>>>>>>>>>>>>>>> bdbruin@gmail.com
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Dear All,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Finally, I have been able to make the FIFTH RELEASE CANDIDATE of Airflow
>>>>>>>>>>>>>>>>>>>>> 1.8.0 available at: https://dist.apache.org/repos/dist/dev/incubator/airflow/ ,
>>>>>>>>>>>>>>>>>>>>> public keys are available at
>>>>>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/release/incubator/airflow/ . It is
>>>>>>>>>>>>>>>>>>>>> tagged with a local version “apache.incubating” so it allows upgrading from
>>>>>>>>>>>>>>>>>>>>> earlier releases.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Issues fixed since rc4:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-900] Double trigger should not kill original task instance
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-900] Fixes bugs in LocalTaskJob for double run protection
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-932] Do not mark tasks removed when backfilling
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-961] run onkill when SIGTERMed
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-910] Use parallel task execution for backfills
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-967] Wrap strings in native for py2 ldap compatibility
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-941] Use defined parameters for psycopg2
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-719] Prevent DAGs from ending prematurely
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-938] Use test for True in task_stats queries
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-937] Improve performance of task_stats
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-933] use ast.literal_eval rather eval because ast.literal_eval
>>>>>>>>>>>>>>>>>>>>> does not execute input.
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-919] Running tasks with no start date shouldn't break a DAGs UI
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-897] Prevent dagruns from failing with unfinished tasks
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-861] make pickle_info endpoint be login_required
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-853] use utf8 encoding for stdout line decode
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-856] Make sure execution date is set for local client
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-830][AIRFLOW-829][AIRFLOW-88] Reduce Travis log verbosity
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-794] Access DAGS_FOLDER and SQL_ALCHEMY_CONN exclusively from
>>>>>>>>>>>>>>>>>>>>> settings
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-694] Fix config behaviour for empty envvar
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-365] Set dag.fileloc explicitly and use for Code view
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-931] Do not set QUEUED in TaskInstances
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-899] Tasks in SCHEDULED state should be white in the UI instead
>>>>>>>>>>>>>>>>>>>>> of black
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-895] Address Apache release incompliancies
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-893][AIRFLOW-510] Fix crashing webservers when a dagrun has no
>>>>>>>>>>>>>>>>>>>>> start date
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-793] Enable compressed loading in S3ToHiveTransfer
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-863] Example DAGs should have recent start dates
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-869] Refactor mark success functionality
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-856] Make sure execution date is set for local client
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-814] Fix Presto*CheckOperator.__init__
>>>>>>>>>>>>>>>>>>>>> [AIRFLOW-844] Fix cgroups directory creation
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> No known issues anymore.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I would also like to raise a VOTE for releasing 1.8.0 based on release
>>>>>>>>>>>>>>>>>>>>> candidate 5, i.e. just renaming release candidate 5 to 1.8.0 release.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Please respond to this email by:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> +1,0,-1 with *binding* if you are a PMC member or *non-binding* if you are
>>>>>>>>>>>>>>>>>>>>> not.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>>> Bolke
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> My VOTE: +1 (binding)
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> _/
>>>>>>>>>>>>>>>>> _/ Alex Van Boxel
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>> 

