airflow-commits mailing list archives

From "Jeffrey Payne (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRFLOW-3035) gcp_dataproc_hook should treat CANCELLED job state consistently
Date Mon, 10 Sep 2018 22:38:00 GMT
Jeffrey Payne created AIRFLOW-3035:
--------------------------------------

             Summary: gcp_dataproc_hook should treat CANCELLED job state consistently
                 Key: AIRFLOW-3035
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3035
             Project: Apache Airflow
          Issue Type: Bug
          Components: contrib
    Affects Versions: 1.10.0, 2.0.0, 1.10.1
            Reporter: Jeffrey Payne


When a Dataproc (DP) job is cancelled, {{gcp_dataproc_hook.py}} treats the {{CANCELLED}} state
in an inconsistent and non-intuitive manner:
# The API internal to {{gcp_dataproc_hook.py}} returns {{False}} from {{_DataProcJob.wait_for_done()}},
resulting in {{raise_error()}} being called for cancelled jobs, yet {{raise_error()}} only
raises {{Exception}} if the job state is {{ERROR}}.
# The end result from the perspective of the {{dataproc_operator.py}} for a cancelled job
is that the job succeeded, which results in the success callback being called.  This seems
strange to me, as a "cancelled" job is rarely considered successful, in my experience.

Simply changing {{raise_error()}} from:
{code:python}
        if 'ERROR' == self.job['status']['state']:
{code}
to
{code:python}
        if self.job['status']['state'] in ('ERROR', 'CANCELLED'):
{code}
would fix both of these...
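
The difference between the two checks can be sketched in isolation. This is only an illustration, not the hook's actual code: {{FakeDataProcJob}}, {{run()}}, and the two {{raise_error_*}} method names are hypothetical stand-ins, and {{wait_for_done()}} is assumed to return {{True}} only for a {{DONE}} job.

```python
class FakeDataProcJob:
    """Hypothetical stand-in for the hook's job wrapper (not the real class)."""

    def __init__(self, state):
        # Same nested shape as the job resource referenced in the report.
        self.job = {'status': {'state': state}}

    def wait_for_done(self):
        # Assumption: anything other than DONE is reported as "not done OK".
        return self.job['status']['state'] == 'DONE'

    def raise_error_current(self):
        # Check as quoted in the report: only ERROR raises.
        if 'ERROR' == self.job['status']['state']:
            raise Exception('Dataproc job failed: ' + self.job['status']['state'])

    def raise_error_proposed(self):
        # Proposed check: CANCELLED is treated as a failure as well.
        if self.job['status']['state'] in ('ERROR', 'CANCELLED'):
            raise Exception('Dataproc job failed: ' + self.job['status']['state'])


def run(job, raise_error):
    """Simplified operator flow: 'success' unless raise_error() raises."""
    if not job.wait_for_done():
        raise_error()
    return 'success'


if __name__ == '__main__':
    cancelled = FakeDataProcJob('CANCELLED')
    # Current check: the cancelled job falls through and looks successful.
    print(run(cancelled, cancelled.raise_error_current))
    # Proposed check: the cancelled job now surfaces as an exception.
    try:
        run(cancelled, cancelled.raise_error_proposed)
    except Exception as exc:
        print('raised:', exc)
```

With the current check, a cancelled job passes through {{run()}} without an exception, which is what lets the success callback fire; with the proposed check the same job raises.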



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)