airflow-commits mailing list archives

From "Marcin Szymanski (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-3065) Scheduler failing tasks when DAG concurrency limit reached
Date Mon, 17 Sep 2018 13:15:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617509#comment-16617509 ]

Marcin Szymanski commented on AIRFLOW-3065:
-------------------------------------------

Turned out to be caused by AIRFLOW-1104.

@committers can we have the fix included in 1.10.1?

> Scheduler failing tasks when DAG concurrency limit reached
> ----------------------------------------------------------
>
>                 Key: AIRFLOW-3065
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3065
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.10.0
>            Reporter: Marcin Szymanski
>            Priority: Critical
>
> In a DAG with a concurrency limit of 4 and about 150 tasks, the scheduler starts to fail queued tasks once the limit of active tasks is reached. They are retried later, but any downstream tasks remain in upstream_failed status.
> A few additional details:
>  * celery executor
>  * environment upgraded from 1.9 (no issues back then)
>  * all configuration in airflow.cfg updated to the latest set of options
>  * issue happens both with the PyPI 1.10 release and a build from branch v1-10-test (c36ef06)
>  
>  
> {noformat}
> [2018-09-14 13:51:23,560] {models.py:1336} INFO - Dependencies all met for <TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
> [2018-09-14 13:51:23,850] {models.py:1330} INFO - Dependencies not met for <TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>, dependency 'Task Instance Slots Available' FAILED: The maximum number of running tasks (4) for this task's DAG 'consolidated_db' has been reached.
> [2018-09-14 13:51:23,852] {models.py:1531} WARNING - 
> --------------------------------------------------------------------------------
> FIXME: Rescheduling due to concurrency limits reached at task runtime. Attempt 1 of 1. State set to NONE.
> --------------------------------------------------------------------------------
> [2018-09-14 13:51:23,853] {models.py:1534} INFO - Queuing into pool None
> [2018-09-14 13:51:23,560] {models.py:1336} INFO - Dependencies all met for <TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
> [2018-09-14 13:51:23,850] {models.py:1330} INFO - Dependencies not met for <TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>, dependency 'Task Instance Slots Available' FAILED: The maximum number of running tasks (4) for this task's DAG 'consolidated_db' has been reached.
> [2018-09-14 13:51:23,852] {models.py:1531} WARNING - 
> --------------------------------------------------------------------------------
> FIXME: Rescheduling due to concurrency limits reached at task runtime. Attempt 1 of 1. State set to NONE.
> --------------------------------------------------------------------------------
> [2018-09-14 13:51:23,853] {models.py:1534} INFO - Queuing into pool None
> [2018-09-14 13:52:49,939] {models.py:1336} INFO - Dependencies all met for <TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
> [2018-09-14 13:52:50,142] {models.py:1336} INFO - Dependencies all met for <TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
> [2018-09-14 13:52:50,235] {models.py:1548} INFO - 
> --------------------------------------------------------------------------------
> Starting attempt 1 of 1
> --------------------------------------------------------------------------------
> [2018-09-14 13:52:50,646] {models.py:1570} INFO - Executing <Task(PostgresDumpOperator): item> on 2018-09-14T12:42:55.379761+00:00
> {noformat}
>  
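
For readers unfamiliar with the "Task Instance Slots Available" check seen in the log, the following is a minimal sketch of the behaviour the log message itself describes: when the DAG's running-task limit is reached, the task has its state set to None so it can be picked up again, rather than being marked failed. This is not Airflow's code. The `TaskInstance` class and the `start_or_reschedule` function are hypothetical names made up for illustration, and the real scheduler logic is considerably more involved.

```python
from dataclasses import dataclass
from typing import List, Optional

RUNNING = "running"
QUEUED = "queued"


@dataclass
class TaskInstance:
    """Hypothetical stand-in for a task instance; not Airflow's model class."""
    task_id: str
    state: Optional[str] = QUEUED  # None means "no state", i.e. schedulable again


def start_or_reschedule(tasks: List[TaskInstance], concurrency: int) -> None:
    """Start queued tasks while slots remain; reset the rest to None."""
    running = sum(1 for t in tasks if t.state == RUNNING)
    for t in tasks:
        if t.state != QUEUED:
            continue
        if running < concurrency:
            t.state = RUNNING
            running += 1
        else:
            # Limit reached: hand the task back for rescheduling, do not fail it.
            t.state = None


# Six queued tasks against a limit of 4, mirroring the limit in the report.
tasks = [TaskInstance(f"task_{i}") for i in range(6)]
start_or_reschedule(tasks, concurrency=4)
print([t.state for t in tasks])
```

In this sketch no task ever reaches a failed state because of the limit, so nothing downstream would be marked upstream_failed. The report describes the opposite outcome.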



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
