airflow-dev mailing list archives

From Paul Minton <paul.min...@cloverhealth.com.INVALID>
Subject Re: Task instance enqueued but never runs, holds up next run of the DAG
Date Wed, 01 Mar 2017 19:21:16 GMT
Could the problem be related to this note in UPDATING.md about stricter pool
checking in 1.8.0?

https://github.com/apache/incubator-airflow/blob/1.8.0/UPDATING.md#tasks-not-starting-although-dependencies-are-met-due-to-stricter-pool-checking

On Tue, Feb 28, 2017 at 10:52 AM, Vijay Ramesh <vijay@change.org> wrote:

> I have a large DAG (32 tasks) with concurrency=2 and max_active_runs=1.
> Most of the tasks also use a redshift_pool, and this is running the
> LocalExecutor on 1.8.0RC4.
>
> When the DAG kicks off things seem to generally function, but a few of the
> tasks get moved to queued status (appropriately) but then never actually
> start.  Looking in the logs I see:
>
> [2017-02-27 13:20:10,349] {base_task_runner.py:95} INFO - Subtask:
> [2017-02-27 13:20:10,348] {models.py:1128} INFO - Dependencies all met for
> <TaskInstance: etl_queries_v3.a_user_day_v2_query 2017-02-26 07:00:00
> [queued]>
> [2017-02-27 13:20:10,356] {base_task_runner.py:95} INFO - Subtask:
> [2017-02-27 13:20:10,356] {models.py:1122} INFO - Dependencies not met for
> <TaskInstance: etl_queries_v3.a_user_day_v2_query 2017-02-26 07:00:00
> [queued]>, dependency 'Task Instance Slots Available' FAILED: The maximum
> number of running tasks (etl_queries_v3) for this task's DAG '2' has been
> reached.
> [2017-02-27 13:20:14,444] {jobs.py:2062} INFO - Task exited with return
> code 0
>
> and then that's it: the queued task is never picked up again. It has been
> different tasks each day, which makes me suspect some sort of scheduling
> race condition. And because the tasks are queued, not failed, the DAG run
> never finishes (so this morning our DAG didn't kick off because
> yesterday's run was still technically "running").
>
> Any thoughts/advice? (I also added
> https://github.com/apache/incubator-airflow/pull/2109 to fix the formatting
> of that error message.)
>
> Thanks,
>  - Vijay Ramesh
>
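
For anyone reading the log excerpt above, here is a rough toy model of the
kind of check the 'Task Instance Slots Available' message describes: a queued
task is held back while the number of running tasks in the DAG has reached the
DAG's concurrency limit. This is only a sketch and not Airflow's actual code.
The Task tuple, both function names, and the task ids task_a and task_b are
made up for illustration; only concurrency=2 and the a_user_day_v2_query task
id come from the report.

```python
from collections import namedtuple

# Hypothetical stand-in for a task instance; NOT Airflow's TaskInstance class.
Task = namedtuple("Task", ["task_id", "state"])


def dag_slots_available(tasks, concurrency):
    """Return True if fewer than `concurrency` tasks are in the running state."""
    running = sum(1 for t in tasks if t.state == "running")
    return running < concurrency


def blocked_queued_tasks(tasks, concurrency):
    """Return ids of queued tasks that cannot start because the DAG is at its limit."""
    if dag_slots_available(tasks, concurrency):
        return []
    return [t.task_id for t in tasks if t.state == "queued"]


# Two tasks already running under concurrency=2, one task waiting in the queue.
tasks = [
    Task("task_a", "running"),
    Task("task_b", "running"),
    Task("a_user_day_v2_query", "queued"),
]

print(dag_slots_available(tasks, concurrency=2))
print(blocked_queued_tasks(tasks, concurrency=2))
```

In this model the queued task becomes eligible as soon as one running task
finishes, which is why a task that stays queued after the others have
completed points to the scheduler not re-evaluating it, rather than to the
limit itself.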
