airflow-commits mailing list archives

From "Charlie (Jira)" <j...@apache.org>
Subject [jira] [Closed] (AIRFLOW-6194) Task instances aren't running after meeting dependencies
Date Sat, 07 Dec 2019 22:53:00 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charlie closed AIRFLOW-6194.
----------------------------
    Resolution: Duplicate

> Task instances aren't running after meeting dependencies
> --------------------------------------------------------
>
>                 Key: AIRFLOW-6194
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6194
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DagRun, executors, scheduler, worker
>    Affects Versions: 1.10.6
>            Reporter: Charlie
>            Priority: Major
>
> We recently had an issue arise with our Airflow instance which caused the scheduler to
enter some sort of a deadlocked state in the middle of operation. In this state, all DAG runs
were listed as 'scheduled' and it didn't appear as if anything at all was happening.
> Initially, I thought this might be an issue with our configuration, but I couldn't quite
track down why this issue wouldn't have arisen earlier and, looking at the logs, I've been
seeing some strange behavior that I can't quite explain.
> The most notable thing is that, for whatever reason, the Executor Class listed under
all of our jobs is 'NoneType', where previously it was 'LocalExecutor'. Looking at our logs,
this change first happened when we updated our instance two days prior to the initial deadlock.
However, I have since cleared the database altogether and find that, even starting from scratch,
'NoneType' still appears.
> In these same logs, I can see jobs continuously running for this DAG run; however, the
start and end times are less than a second apart. At the same time, all task instances are
listed as either 'success' or 'scheduled', so I'm not entirely sure what the running jobs are.
> If I look in the Task Instance Details for any of these scheduled tasks, I see:
> {code:java}
> All dependencies are met but the task instance is not running. In most cases this just
means that the task will probably be scheduled soon unless:
> - The scheduler is down or under heavy load
> If this task instance does not start soon please contact your Airflow administrator for
assistance.{code}
> Upon viewing the Airflow logs for the scheduler, nothing seems awry.
> So to summarize, the scheduler seems to be doing its job, as DAG runs are properly scheduled
and set as 'running'; however, the instances themselves are not completing properly. Due to
the listing of 'NoneType' instead of 'LocalExecutor' for the jobs, my theory is that there
is some issue with the LocalExecutor that is causing it not to execute jobs properly. Again,
clearing the database didn't seem to help, and I now run into this deadlock almost immediately
with a test DAG I'm running.
> If I can provide any additional information, please let me know. I'd love to get this
resolved or figured out, as we're currently unable to use Airflow because of this.
> Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
