airflow-commits mailing list archives

From "Daniel van der Ende (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRFLOW-1658) Kill (possibly) still running Druid indexing job after max timeout is exceeded
Date Thu, 28 Sep 2017 18:04:00 GMT
Daniel van der Ende created AIRFLOW-1658:
--------------------------------------------

             Summary: Kill (possibly) still running Druid indexing job after max timeout is exceeded
                 Key: AIRFLOW-1658
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1658
             Project: Apache Airflow
          Issue Type: Improvement
          Components: hooks
            Reporter: Daniel van der Ende
            Assignee: Daniel van der Ende
            Priority: Minor


Right now, the Druid hook has a parameter max_ingestion_time. If the total execution time of
the Druid indexing job exceeds this timeout, an AirflowException is raised. However, this does
not necessarily mean that the Druid task has failed (a busy Hadoop cluster, for example, could
also be to blame for the slow performance). If the Airflow task is then retried, you end up
with multiple Druid tasks performing the same work.
To prevent this, we can call the shutdown endpoint for the task id that is still running.
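
A minimal sketch of the idea (the function and parameter names here are illustrative, not the
actual DruidHook internals); it assumes the standard Druid overlord task status and shutdown
endpoints:

    # Sketch only: poll a Druid indexing task and shut it down once max_ingestion_time
    # is exceeded, so a retried Airflow task does not leave a duplicate indexing job behind.
    import time
    import requests

    from airflow.exceptions import AirflowException


    def wait_for_ingestion(overlord_url, task_id, max_ingestion_time, poll_interval=30):
        elapsed = 0
        while True:
            # Druid overlord API: GET /druid/indexer/v1/task/{taskId}/status
            status = requests.get(
                "{}/druid/indexer/v1/task/{}/status".format(overlord_url, task_id)
            ).json()["status"]["status"]
            if status == "SUCCESS":
                return
            if status == "FAILED":
                raise AirflowException("Druid indexing task {} failed".format(task_id))
            if elapsed > max_ingestion_time:
                # Kill the still-running task before raising the timeout exception.
                # Druid overlord API: POST /druid/indexer/v1/task/{taskId}/shutdown
                requests.post(
                    "{}/druid/indexer/v1/task/{}/shutdown".format(overlord_url, task_id)
                )
                raise AirflowException(
                    "Druid ingestion exceeded {}s; task {} was shut down".format(
                        max_ingestion_time, task_id
                    )
                )
            time.sleep(poll_interval)
            elapsed += poll_interval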



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
