airflow-commits mailing list archives

From "Paul Yang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRFLOW-374) Kill task instances that haven't been able to heartbeat for a while
Date Thu, 28 Jul 2016 01:02:20 GMT
Paul Yang created AIRFLOW-374:
---------------------------------

             Summary: Kill task instances that haven't been able to heartbeat for a while
                 Key: AIRFLOW-374
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-374
             Project: Apache Airflow
          Issue Type: Improvement
          Components: operators
            Reporter: Paul Yang
            Assignee: Paul Yang
             Fix For: Airflow 1.8


A task run by the LocalTaskJob periodically updates a timestamp to indicate that the task
is still alive and running. If the task is unable to update this timestamp for a long time
(for example, due to DB connection errors), the scheduler may reschedule the task to run again.
In that case, two instances of the task can end up running at the same time. To prevent this,
the task can track the time since its last successful heartbeat and kill itself once that time
exceeds a limit.
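
The idea could be sketched roughly as follows. This is a minimal illustration, not Airflow's actual implementation: the class `HeartbeatMonitor`, the callbacks `write_heartbeat` and `kill_task`, and the parameter `max_silence_sec` are all hypothetical names chosen for this example.

```python
import time


class HeartbeatMonitor:
    """Tracks the last successful heartbeat and triggers a self-kill
    when heartbeats have been failing for too long.

    All names here are hypothetical; this is not Airflow's API.
    """

    def __init__(self, write_heartbeat, kill_task, max_silence_sec,
                 clock=time.monotonic):
        self._write_heartbeat = write_heartbeat  # e.g. updates a DB timestamp
        self._kill_task = kill_task              # terminates the running task
        self._max_silence_sec = max_silence_sec  # allowed time without a heartbeat
        self._clock = clock
        self._last_success = clock()

    def beat(self):
        """Attempt one heartbeat. Return True if the task may keep running."""
        try:
            self._write_heartbeat()
        except Exception:
            # The heartbeat failed (for example, a DB connection error).
            # Keep running only while the silence is within the limit.
            silence = self._clock() - self._last_success
            if silence > self._max_silence_sec:
                self._kill_task()
                return False
            return True
        # The heartbeat succeeded, so reset the silence window.
        self._last_success = self._clock()
        return True
```

The job's main loop would call `beat()` at its usual heartbeat interval and stop when it returns `False`. Choosing `max_silence_sec` below the scheduler's own rescheduling timeout is what would keep the two instances from overlapping; the exact value is left open here.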



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)