airflow-commits mailing list archives

From "Dylan Gorman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AIRFLOW-1965) Scheduler dies and subdag locked in running state when sub-subdag fails
Date Thu, 04 Jan 2018 18:39:00 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dylan Gorman updated AIRFLOW-1965:
----------------------------------
    Summary: Scheduler dies and subdag locked in running state when sub-subdag fails
             (was: Like Scheduler dies and subdag locked in running state when sub-subdag fails)

> Scheduler dies and subdag locked in running state when sub-subdag fails
> -----------------------------------------------------------------------
>
>                 Key: AIRFLOW-1965
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1965
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.8.1
>            Reporter: Dylan Gorman
>         Attachments: subdagtest.py
>
>
> We have a problem with subdags getting locked in the paused state. This seems to happen
> when the subdags contain another level of subdags that fail.
> The dag id corresponding to the subdag gets switched to paused in the database after
> the sub-subdag fails. The scheduler also seems to die when this occurs (trying to
> clear the task leaves it in the shutdown state). The SubDagOperator task stays in the running
> state and never switches to failed. This seems to happen only when we set concurrency=1 in
> the first-level subdag.
> We attach a Python file that should reproduce the behavior.
> In particular, our graph looks like this:
> Dummy Task_1 >> Subdag >> Dummy Task_2
> The subdag graph looks like:
> Dummy_Task_Root >> SubSubdag_1 >> Dummy_Task_final
> Dummy_Task_Root >> SubSubdag_2 >> Dummy_Task_final
> SubSubdag_1 and SubSubdag_2 each contain only a failing PythonOperator. When either of the
> SubSubdags fails, the top-level subdag then gets stuck in the paused state and does not allow
> execution of the other SubSubdag.
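
For readers without access to the attachment, the topology described above might be sketched roughly as follows. This is not the reporter's subdagtest.py: the DAG id `subdag_test`, the start date, the schedule, the task id `fail`, and the helper names are all assumptions made for illustration, task names are normalized to underscores, and the import paths are the Airflow 1.8-style ones. Airflow is imported lazily inside the builder functions so that the pure helpers can be exercised without it.

```python
from datetime import datetime

# Assumed values: the report does not give DAG ids, dates, or a schedule.
DEFAULT_ARGS = {"owner": "airflow", "start_date": datetime(2018, 1, 1)}
SCHEDULE = "@daily"


def child_dag_id(parent_dag_id, task_id):
    # SubDagOperator expects the subdag's dag_id to be "<parent dag_id>.<task_id>".
    return "%s.%s" % (parent_dag_id, task_id)


def always_fail():
    # Hypothetical stand-in for the failing callable inside each sub-subdag.
    raise ValueError("intentional failure")


def build_subsubdag(parent_dag_id, task_id):
    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator

    dag = DAG(child_dag_id(parent_dag_id, task_id),
              default_args=DEFAULT_ARGS, schedule_interval=SCHEDULE)
    PythonOperator(task_id="fail", python_callable=always_fail, dag=dag)
    return dag


def build_subdag(parent_dag_id, task_id):
    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator
    from airflow.operators.subdag_operator import SubDagOperator

    # concurrency=1 on the first-level subdag is the setting the report
    # associates with the hang.
    dag = DAG(child_dag_id(parent_dag_id, task_id),
              default_args=DEFAULT_ARGS, schedule_interval=SCHEDULE,
              concurrency=1)
    root = DummyOperator(task_id="Dummy_Task_Root", dag=dag)
    final = DummyOperator(task_id="Dummy_Task_final", dag=dag)
    for name in ("SubSubdag_1", "SubSubdag_2"):
        sub = SubDagOperator(task_id=name,
                             subdag=build_subsubdag(dag.dag_id, name),
                             dag=dag)
        root >> sub >> final
    return dag


def build_main_dag(dag_id="subdag_test"):
    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator
    from airflow.operators.subdag_operator import SubDagOperator

    dag = DAG(dag_id, default_args=DEFAULT_ARGS, schedule_interval=SCHEDULE)
    first = DummyOperator(task_id="Dummy_Task_1", dag=dag)
    sub = SubDagOperator(task_id="Subdag",
                         subdag=build_subdag(dag_id, "Subdag"), dag=dag)
    last = DummyOperator(task_id="Dummy_Task_2", dag=dag)
    first >> sub >> last
    return dag
```

To have the scheduler pick this up, a DAG file would additionally need a module-level assignment such as `dag = build_main_dag()`. Whether this sketch triggers the reported behavior has not been verified here; it only mirrors the graph and the concurrency=1 setting from the description.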



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
