airflow-dev mailing list archives

From Bolke de Bruin <bdbr...@gmail.com>
Subject Re: Tasks Queued but never run
Date Wed, 07 Jun 2017 12:43:25 GMT
The issue is that you are hitting concurrency limits. In that case the tasks set their
own state to NONE (which they should not, but that was an earlier discussion with Alex
and Dan), so they fall out of the list of tasks that need to be run.
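
To make the failure mode described above concrete, here is a deliberately simplified toy model. It is not Airflow's backfill code: the function name run_backfill, its parameters, and the state strings are all hypothetical and chosen only for illustration. It shows how a task that clears its own state on hitting a concurrency limit can drop out of the set the loop is tracking, while a task that is simply kept for the next pass eventually runs.

```python
from collections import deque


def run_backfill(task_ids, limit, reset_to_none_on_limit):
    """Toy backfill loop (hypothetical, NOT Airflow's implementation).

    Returns (finished, stranded): tasks that ran, and tasks left with
    state None that the loop no longer tracks.
    """
    state = {t: "scheduled" for t in task_ids}
    to_run = deque(task_ids)
    finished = []

    while to_run:
        batch = list(to_run)
        to_run.clear()
        running = []

        for t in batch:
            if len(running) < limit:
                # A slot is free, so the task starts.
                state[t] = "running"
                running.append(t)
            elif reset_to_none_on_limit:
                # Assumed buggy behaviour: the task clears its own state
                # and is never put back on the list of tasks to run.
                state[t] = None
            else:
                # Assumed desired behaviour: keep the task for the next pass.
                to_run.append(t)

        # Pretend every started task completes before the next pass.
        for t in running:
            state[t] = "success"
            finished.append(t)

    stranded = [t for t, s in state.items() if s is None]
    return finished, stranded


if __name__ == "__main__":
    tasks = ["t1", "t2", "t3", "t4", "t5"]
    print(run_backfill(tasks, limit=2, reset_to_none_on_limit=True))
    print(run_backfill(tasks, limit=2, reset_to_none_on_limit=False))
```

With a limit of 2 and five tasks, the first call leaves three tasks stranded with no state, which matches the "no status" symptom reported in this thread; the second call runs all five over three passes.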

Working on a patch for it.

Bolke

> On 7 Jun 2017, at 12:04, Bolke de Bruin <bdbruin@gmail.com> wrote:
> 
> I can confirm the issue (haven't found the cause yet), but this is with BACKFILLS, which
> function independently from the scheduler. So restarting the scheduler will not help.
> 
> Bolke
> 
>> On 6 Jun 2017, at 19:35, Noah Yetter <noah@craftsy.com> wrote:
>> 
>> I'm experiencing the same issue. I've built a simple DAG with no external
>> dependencies other than bash that illustrates the problem consistently on
>> my machine, find it here:
>> https://gist.github.com/slotrans/b3e475c2b9789c4efc41876567902425
>> 
>> If you run it as e.g. airflow backfill tasks_never_run -s 2017-06-06 -e
>> 2017-06-06 you should see some tasks permanently remain in a state of "no
>> status". Restarting the scheduler will not help. Ctrl-C-ing the backfill
>> command and running it again *may* resolve it. The scheduler will
>> continually log messages like the following:
>> 
>> [2017-06-05 18:42:49,372] {jobs.py:1408} INFO - Heartbeating the process
>> manager
>> [2017-06-05 18:42:49,375] {dag_processing.py:559} INFO - Processor for
>> /Users/noah.yetter/airflow/dags/tasks_never_run.py finished
>> [2017-06-05 18:42:49,428] {jobs.py:1007} INFO - Tasks up for execution:
>> <TaskInstance: tasks_never_run.t_leaf_five__axe 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_five__arrows 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_two__determined 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_two__beyond 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_two__axe 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_two__arrows 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_three__determined 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_three__beyond 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_three__arrows 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_one__beyond 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_one__arrows 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_four__determined 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_four__arrows 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_five__beyond 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_one__determined 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_one__axe 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_five__determined 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_three__axe 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_four__axe 2017-06-05 00:00:00
>> [scheduled]>
>> <TaskInstance: tasks_never_run.t_leaf_four__beyond 2017-06-05 00:00:00
>> [scheduled]>
>> [2017-06-05 18:42:49,430] {jobs.py:1030} INFO - Figuring out tasks to run
>> in Pool(name=None) with 128 open slots and 20 task instances in queue
>> [2017-06-05 18:42:49,463] {jobs.py:1444} INFO - Heartbeating the executor
>> 
>> 
>> On 2017-06-01 15:18 (-0600), "Josef Sa...@gmail.com> wrote:
>>> Hi!
>>> 
>>> We have a problem with our Airflow. Sometimes several tasks get queued
>>> but never run, and they remain in the Queued state forever. Other tasks from
>>> the same schedule interval run, and the next schedule interval runs normally
>>> too. But these several tasks remain queued.
>>> 
>>> We are using Airflow 1.8.1, currently with CeleryExecutor and redis, but
>>> we had the same problem with LocalExecutor as well (actually, switching to
>>> Celery helped quite a bit; the problem now happens far less often, but
>>> it still happens). We have 18 DAGs total, 13 active. Some have just 1-2
>>> tasks, but some are more complex, with 8 tasks or so and with upstreams.
>>> There are also ExternalTaskSensor tasks used.
>>> 
>>> I tried playing around with DAG configurations (limiting concurrency,
>>> max_active_runs, ...), tried switching off some DAGs completely (not all
>>> but most), etc. So far nothing has helped. Right now I am not really sure
>>> what else to try to identify and solve the issue.
>>> 
>>> I am getting a bit desperate, so I would really appreciate any help with
>>> this. Thank you all in advance!
>>> 
>>> Joe
>>> 
> 

