airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-1658) Kill (possibly) still running Druid indexing job after max timeout is exceeded
Date Mon, 02 Oct 2017 15:10:02 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188284#comment-16188284 ]

ASF subversion and git services commented on AIRFLOW-1658:
----------------------------------------------------------

Commit c61726288dcdb093c55a38faaf60aef020d0d3e0 in incubator-airflow's branch refs/heads/master
from [~danielvdende]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=c617262 ]

[AIRFLOW-1658] Kill Druid task on timeout

If the total execution time of a Druid task exceeds the max timeout defined,
the Airflow task fails, but the Druid task may still keep running. This can
cause undesired behaviour if Airflow retries the task. This patch calls the
shutdown endpoint on the Druid task to kill any still-running Druid task.

This commit also adds tests to ensure that all mocked requests in the Druid
hook are actually called.

Closes #2644 from danielvdende/kill_druid_task_on_timeout_exceeded


> Kill (possibly) still running Druid indexing job after max timeout is exceeded
> ------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-1658
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1658
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: hooks
>            Reporter: Daniel van der Ende
>            Assignee: Daniel van der Ende
>            Priority: Minor
>             Fix For: 1.9.0
>
>
> Right now, the Druid hook contains a parameter max_ingestion_time. If the total
> execution time of the Druid indexing job exceeds this timeout, an AirflowException
> is thrown. However, this does not necessarily mean that the Druid task failed (for
> example, a busy Hadoop cluster could also be to blame for slow performance). If the
> Airflow task is then retried, you end up with multiple Druid tasks performing the
> same work.
> To easily prevent this, we can call the shutdown endpoint on the task id that is
> still running.
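
The poll-then-shutdown pattern described above can be sketched roughly as below. This is a minimal illustration, not the actual hook code from the patch; `poll_status` and `shutdown` are hypothetical stand-ins for the HTTP calls the hook makes against the Druid overlord's task status and shutdown endpoints.

```python
import time


def run_with_timeout(poll_status, shutdown, max_ingestion_time, poll_interval=1.0):
    """Poll a Druid indexing task until it finishes.

    If the task runs longer than max_ingestion_time (seconds), request a
    shutdown of the still-running Druid task before failing, so a retried
    Airflow task does not end up running alongside the old Druid task.
    """
    start = time.monotonic()
    while True:
        status = poll_status()
        if status == "SUCCESS":
            return status
        if status == "FAILED":
            raise RuntimeError("Druid indexing task failed")
        if time.monotonic() - start > max_ingestion_time:
            # Kill the (possibly) still running Druid task, then fail.
            shutdown()
            raise RuntimeError(
                "Druid task exceeded max_ingestion_time; shutdown requested"
            )
        time.sleep(poll_interval)
```

The key point is that the shutdown call happens before the exception is raised, so by the time Airflow schedules a retry the old Druid task has already been asked to stop.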



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
