airflow-commits mailing list archives

From "Dennis Muth (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRFLOW-1327) LocalExecutor won't reschedule on concurrency limit hit
Date Tue, 20 Jun 2017 13:53:00 GMT
Dennis Muth created AIRFLOW-1327:
------------------------------------

             Summary: LocalExecutor won't reschedule on concurrency limit hit
                 Key: AIRFLOW-1327
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1327
             Project: Apache Airflow
          Issue Type: Bug
          Components: scheduler
    Affects Versions: 1.8.1
         Environment: LocalExecutor
            Reporter: Dennis Muth
         Attachments: Airflow_logs.png, ti_unscheduled.png

For several days we have been trying to migrate from Airflow 1.7.1.3 to 1.8.1.
Unfortunately, we ran into a serious issue that seems to be scheduler related (we are using
the LocalExecutor).

When running a SubDag, some task instances get queued (queues are defined), switch to running
and some time later finish. That's how it should be.
But some task instances get queued, print a cryptic warning message (shown below) and then
end up with no state (NONE).

The warning message:
{code}
FIXME: Rescheduling due to concurrency limits reached at task runtime. Attempt 1 of 2. State set to NONE.
{code}

This suggests that a concurrency limit is too low and that the scheduler will pick this
instance up again later, once more slots are available.
We have waited for quite some time now, but the task is never rescheduled.
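
To make the expected behaviour concrete, here is a toy Python model of what the warning
message seems to promise: a task that finds no free slot is set back to no state and is
queued again on a later scheduler pass. This is only a sketch and not Airflow code;
run_scheduler, slots and duration are made-up names, and the real scheduler and
LocalExecutor are far more involved.

```python
def run_scheduler(task_ids, slots, duration=2, max_ticks=50):
    """Toy scheduler loop (hypothetical, not Airflow's implementation).

    Returns (final states, deferral count per task, last tick executed).
    """
    state = {t: None for t in task_ids}   # None mirrors the "no state" seen in the UI
    remaining = {}                        # running task -> ticks left
    deferrals = {t: 0 for t in task_ids}  # how often a task hit the limit

    for tick in range(max_ticks):
        # 1. Advance running tasks; a finished task frees its slot.
        for t in list(remaining):
            remaining[t] -= 1
            if remaining[t] == 0:
                del remaining[t]
                state[t] = "success"

        # 2. Queue every task that currently has no state.
        for t in task_ids:
            if state[t] is None:
                state[t] = "queued"

        # 3. Start queued tasks while slots are free; otherwise reset the
        #    state to None so the next pass can pick the task up again.
        for t in task_ids:
            if state[t] != "queued":
                continue
            if len(remaining) < slots:
                state[t] = "running"
                remaining[t] = duration
            else:
                state[t] = None
                deferrals[t] += 1

        if all(s == "success" for s in state.values()):
            return state, deferrals, tick

    return state, deferrals, max_ticks


if __name__ == "__main__":
    states, deferrals, last_tick = run_scheduler(["t1", "t2", "t3", "t4", "t5"], slots=2)
    print(states, deferrals, last_tick)
```

In this model every deferred task eventually runs, because step 2 queues it again on the
next pass. What we observe in 1.8.1 looks as if that second step never happens for the
affected task instances.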

When I rerun the subdag, some previously failed task instances (state = None) now succeed,
but others that were previously successful fail. Weird...

I've attached some screenshots to make this clearer.
Is this a bug or intended behaviour? Do we need to switch to the CeleryExecutor?

Please do not hesitate to ask if you need additional logs or other information.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
