airflow-commits mailing list archives

From "Dan Davydov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-1019) active_dagruns shouldn't include paused DAGs
Date Tue, 11 Apr 2017 22:57:41 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965082#comment-15965082 ]

Dan Davydov commented on AIRFLOW-1019:
--------------------------------------

Probably not a blocker, just a performance degradation.

> active_dagruns shouldn't include paused DAGs
> --------------------------------------------
>
>                 Key: AIRFLOW-1019
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1019
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 1.8.0
>            Reporter: Dan Davydov
>            Priority: Critical
>             Fix For: 1.8.1
>
>
> Since 1.8.0, Airflow resets orphaned tasks (tasks that are in the DB but not in the executor's
> memory). The problem is that Airflow counts dagruns in paused DAGs as running as long as the
> dagrun's state is running. Instead, we should join against non-paused DAGs everywhere we calculate
> active dagruns (e.g. in _process_task_instances in the Scheduler class in jobs.py). If there
> are enough paused DAGs, this brings the scheduler to a halt, especially on scheduler restarts.
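
The join proposed in the description can be illustrated with a minimal sketch. It is not the Airflow scheduler code: it uses Python's built-in sqlite3 module and a simplified, assumed schema (tables `dag` and `dag_run` with only the columns needed here), and the helper names `build_demo_db` and `count_active_dagruns` are hypothetical. It only shows how joining against non-paused DAGs changes the count of active dagruns.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dag (
            dag_id    TEXT PRIMARY KEY,
            is_paused INTEGER NOT NULL
        );
        CREATE TABLE dag_run (
            id     INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            state  TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dag (dag_id, is_paused) VALUES (?, ?)",
        [("live_dag", 0), ("paused_dag", 1)],
    )
    conn.executemany(
        "INSERT INTO dag_run (dag_id, state) VALUES (?, ?)",
        [
            ("live_dag", "running"),
            ("live_dag", "success"),
            ("paused_dag", "running"),
            ("paused_dag", "running"),
        ],
    )
    conn.commit()
    return conn


def count_active_dagruns(conn, include_paused=False):
    """Count dagruns whose state is 'running'.

    With include_paused=False the query joins against the dag table
    and keeps only runs that belong to non-paused DAGs.
    """
    if include_paused:
        sql = "SELECT COUNT(*) FROM dag_run WHERE state = 'running'"
    else:
        sql = (
            "SELECT COUNT(*) FROM dag_run "
            "JOIN dag ON dag.dag_id = dag_run.dag_id "
            "WHERE dag_run.state = 'running' AND dag.is_paused = 0"
        )
    return conn.execute(sql).fetchone()[0]


if __name__ == "__main__":
    demo = build_demo_db()
    print(count_active_dagruns(demo, include_paused=True))
    print(count_active_dagruns(demo))
```

In this toy data set, the query without the join also counts the two running dagruns of the paused DAG, while the joined query counts only the run of the non-paused DAG.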



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
