airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-41) SubdagOperators can oversubscribe to pools due to race condition
Date Tue, 29 Nov 2016 20:42:58 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706471#comment-15706471 ]

ASF subversion and git services commented on AIRFLOW-41:
--------------------------------------------------------

Commit d9bba86e9262e6133a8accf2d01dd94601c1579b in incubator-airflow's branch refs/heads/master
from [~g.toonstra]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=d9bba86 ]

[AIRFLOW-41] Fix pool oversubscription

The scheduler would send tasks to the queue for "open minus running"
instances. When a task was eventually picked up, it checked whether a
slot was free and, if so, ran. That check is a race condition, because
multiple tasks can compete for the same slot. If no slot was free, the
task was set back to QUEUED (or SCHEDULED), that is, returned to the
scheduler for another run. In specific cases, a couple of task
instances would be affected by the non-atomic read and run anyway.
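
To illustrate the non-atomic read described above, here is a minimal
sketch in Python. It is not Airflow code: the Pool class and the
simulate_non_atomic function are hypothetical stand-ins, and the
interleaving is forced (all reads first, then all starts) so that the
result is deterministic.

```python
# Hypothetical in-memory stand-in for a pool; not Airflow's Pool model.
class Pool:
    def __init__(self, slots):
        self.slots = slots    # total slots in the pool
        self.running = 0      # task instances currently running

    def open_slots(self):
        return self.slots - self.running


def simulate_non_atomic(pool, n_tasks):
    """Every task reads open_slots() before any task records that it runs."""
    # Step 1: each task checks for a free slot (the non-atomic read).
    saw_free_slot = [pool.open_slots() > 0 for _ in range(n_tasks)]
    # Step 2: each task that saw a free slot starts running.
    for ok in saw_free_slot:
        if ok:
            pool.running += 1
    return pool.running


pool = Pool(slots=1)
running = simulate_non_atomic(pool, n_tasks=3)
print(running, "running in a pool of", pool.slots)  # 3 running in a pool of 1
```

Because the check and the state change are separate steps, every task
that reads before the first one starts sees the same free slot.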

Closes #1872 from gtoonstra/feature/AIRFLOW-41


> SubdagOperators can oversubscribe to pools due to race condition
> ----------------------------------------------------------------
>
>                 Key: AIRFLOW-41
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-41
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler, subdag
>    Affects Versions: Airflow 1.7.1
>            Reporter: Bolke de Bruin
>
> SubDagOperators essentially create their own mini-scheduler, which can interfere with
> the main scheduler.
> SubDagOperators check whether a slot is available in a pool. However, the slot is not
> claimed at the same time as the check, leaving room for the main scheduler to check for
> the same slot. Both can then obtain the slot and thus oversubscribe the pool.
> A solution could be a centralized PoolHandler that gives out slots.
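
The centralized PoolHandler suggested in the description could look
roughly like the sketch below. This is an assumption-laden illustration,
not the fix that was committed and not an Airflow API: the class name
comes from the issue text, while try_claim, release, used and the "etl"
pool name are hypothetical. It uses an in-process threading.Lock; since
Airflow keeps its state in a metadata database shared by separate
processes, a real implementation would need an equivalent guarantee at
the database level.

```python
import threading


class PoolHandler:
    """Hypothetical central owner of pool slots (not an Airflow class)."""

    def __init__(self, slots):
        self._slots = dict(slots)                 # pool name -> total slots
        self._used = {name: 0 for name in slots}  # pool name -> claimed slots
        self._lock = threading.Lock()

    def try_claim(self, pool):
        # The check and the claim happen under one lock, so no other
        # caller can observe the free slot in between.
        with self._lock:
            if self._used[pool] < self._slots[pool]:
                self._used[pool] += 1
                return True
            return False

    def release(self, pool):
        with self._lock:
            if self._used[pool] > 0:
                self._used[pool] -= 1

    def used(self, pool):
        with self._lock:
            return self._used[pool]


handler = PoolHandler({"etl": 2})
granted = []


def worker():
    # Each caller (main scheduler or subdag) asks the handler for a slot.
    granted.append(handler.try_claim("etl"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With eight competing callers and two slots, exactly two claims succeed;
the rest are refused and can be retried once a slot is released.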



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
