airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-678) Task instances can double-trigger
Date Mon, 12 Dec 2016 20:10:58 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15743009#comment-15743009 ]

ASF subversion and git services commented on AIRFLOW-678:
---------------------------------------------------------

Commit 15ff540ecd5e60e7ce080177ea3ea227582a4672 in incubator-airflow's branch refs/heads/master
from [~aoen]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=15ff540 ]

[AIRFLOW-678] Prevent scheduler from double triggering TIs

At the moment there is no lock/synchronization
around the loop where the scheduler puts tasks in
the SCHEDULED state. This means that if the task
starts running or gets SCHEDULED somewhere else
(e.g. by manually running a task via the
webserver), the task can have its state changed
from RUNNING/QUEUED to SCHEDULED, which can cause
a single task instance to be run twice at the
same time.
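
For illustration only, here is a minimal sketch of one way to guard
such a state transition. It is not the code from this commit. The
table layout, the state names and the helper try_mark_scheduled are
all hypothetical, and an in-memory SQLite database stands in for the
Airflow metadata DB. The idea is a conditional UPDATE that only moves
a task to "scheduled" if it is still in a schedulable state, so a task
that is already running is left alone.

```python
import sqlite3

# Hypothetical state names and table layout, for illustration only;
# these are not Airflow's real schema or constants.
SCHEDULABLE_STATES = ("none", "up_for_retry")


def try_mark_scheduled(conn, task_id):
    """Move a task to 'scheduled' only if it is still schedulable.

    The state check and the write happen in one UPDATE statement, so a
    task that has meanwhile become 'running' or 'queued' is untouched.
    Returns True when this caller performed the transition.
    """
    placeholders = ",".join("?" for _ in SCHEDULABLE_STATES)
    cur = conn.execute(
        "UPDATE task_instance SET state = 'scheduled' "
        "WHERE task_id = ? AND state IN (" + placeholders + ")",
        (task_id,) + SCHEDULABLE_STATES,
    )
    conn.commit()
    return cur.rowcount == 1


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance (task_id TEXT PRIMARY KEY, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?)",
        [("a", "none"), ("b", "running")],
    )
    conn.commit()
    # Task "a" is schedulable; task "b" is already running.
    results = {t: try_mark_scheduled(conn, t) for t in ("a", "b")}
    states = dict(conn.execute("SELECT task_id, state FROM task_instance"))
    return results, states
```

In this sketch the running task keeps its state, and a second attempt
to schedule a task that was already moved to "scheduled" is a no-op. A
row-level lock (e.g. SELECT ... FOR UPDATE) would be another way to
get a similar guarantee on databases that support it.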

Testing Done:
- Tested this branch on the Airbnb Airflow staging
cluster
- Airbnb has been running very similar logic in
our production for many months (not 1-1 since we
are still running off of the last release branch)
- In the future we ideally need an integration
test to catch double triggers but this is not
trivial to do properly

Closes #1924 from
aoen/ddavydov/fix_scheduler_race_condition


> Task instances can double-trigger
> ---------------------------------
>
>                 Key: AIRFLOW-678
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-678
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Dan Davydov
>            Assignee: Dan Davydov
>
> At the moment there is no lock/synchronization around the loop where the scheduler puts
> tasks in the SCHEDULED state. This means that if the task starts running or gets SCHEDULED
> somewhere else (e.g. by manually running a task via the webserver), the task can have
> its state changed from RUNNING/QUEUED to SCHEDULED, which can cause a single task instance
> to be run twice at the same time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
