airflow-commits mailing list archives

From "Jeffrey Payne (JIRA)" <>
Subject [jira] [Updated] (AIRFLOW-3035) gcp_dataproc_hook should treat CANCELLED job state consistently
Date Tue, 11 Sep 2018 00:41:00 GMT


Jeffrey Payne updated AIRFLOW-3035:
    Priority: Minor  (was: Major)

> gcp_dataproc_hook should treat CANCELLED job state consistently
> ---------------------------------------------------------------
>                 Key: AIRFLOW-3035
>                 URL:
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib
>    Affects Versions: 1.10.0, 2.0.0, 1.10.1
>            Reporter: Jeffrey Payne
>            Assignee: Jeffrey Payne
>            Priority: Minor
>              Labels: dataproc
> When a DP job is cancelled, {{}} does not treat the {{CANCELLED}}
state in a consistent or intuitive manner:
> # The API internal to {{}} returns {{False}} from {{_DataProcJob.wait_for_done()}},
resulting in {{raise_error()}} being called for cancelled jobs, yet {{raise_error()}} only
raises {{Exception}} if the job state is {{ERROR}}.
> # The end result from the perspective of the {{}} for a cancelled
job is that the job succeeded, which results in the success callback being called.  This seems
strange to me, as a "cancelled" job is rarely considered successful, in my experience.
> Simply changing {{raise_error()}} from:
> {code:python}
>         if 'ERROR' == self.job['status']['state']:
> {code}
> to
> {code:python}
>         if self.job['status']['state'] in ('ERROR', 'CANCELLED'):
> {code}
> would fix both of these...
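
The effect of that one-line change can be sketched in isolation. The class below is a hypothetical stand-in, not the real {{_DataProcJob}}: it models only the {{job['status']['state']}} lookup quoted above, and the method names {{raise_error_current}} and {{raise_error_proposed}}, the {{message}} parameter, and the exception text are assumptions made for illustration.

```python
class FakeDataProcJob:
    """Hypothetical stand-in for _DataProcJob; models only the job dict."""

    def __init__(self, state):
        # Same shape as in the issue: job['status']['state'].
        self.job = {'status': {'state': state}}

    def raise_error_current(self, message=None):
        # Check as quoted in the issue: only ERROR raises.
        if 'ERROR' == self.job['status']['state']:
            raise Exception(message or 'job failed')

    def raise_error_proposed(self, message=None):
        # Proposed check: CANCELLED is treated as a failure too.
        if self.job['status']['state'] in ('ERROR', 'CANCELLED'):
            raise Exception(message or 'job failed')


def raises(func):
    """Return True if calling func() raises an Exception."""
    try:
        func()
    except Exception:
        return True
    return False


cancelled = FakeDataProcJob('CANCELLED')
print(raises(cancelled.raise_error_current))   # cancelled job passes silently
print(raises(cancelled.raise_error_proposed))  # cancelled job now fails
```

With the current check a cancelled job falls through without an exception, which is why the task ends up reported as successful; with the proposed check it raises like an {{ERROR}} job would.
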
> Another, perhaps better, option would be to have the dataproc job operators accept a
list of {{error_states}} that could be passed into {{raise_error()}}, allowing the caller
to determine which states should result in "failure" of the task.  I would lean towards that
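
A rough sketch of the {{error_states}} idea follows, written as a free function so it runs on its own. The default tuple, the function signature, and the exception message are assumptions for illustration and not the hook's actual API; only the parameter name {{error_states}} and the {{job['status']['state']}} lookup come from the issue text.

```python
# Assumed default: keep today's behaviour unless the caller opts in.
DEFAULT_ERROR_STATES = ('ERROR',)


def raise_error(job, error_states=DEFAULT_ERROR_STATES, message=None):
    """Raise if the job's state is one the caller considers a failure.

    `job` is a dict shaped like the one in the issue:
    {'status': {'state': '<STATE>'}}.
    """
    state = job['status']['state']
    if state in error_states:
        raise Exception(
            message or 'Dataproc job ended in state {}'.format(state)
        )


cancelled_job = {'status': {'state': 'CANCELLED'}}

# Default: a cancelled job is not treated as an error.
raise_error(cancelled_job)

# Caller opts in to treating CANCELLED as a failure.
try:
    raise_error(cancelled_job, error_states=('ERROR', 'CANCELLED'))
except Exception as exc:
    print('task would fail:', exc)
```

Keeping {{('ERROR',)}} as the default would leave existing DAGs unaffected, while an operator could forward its own list to decide whether a cancelled job fails the task.
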

This message was sent by Atlassian JIRA
