airflow-commits mailing list archives

From "ASF subversion and git services (Jira)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-3149) GCP dataproc cluster creation should have the option to delete an ERROR cluster
Date Tue, 17 Sep 2019 10:47:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931286#comment-16931286 ]

ASF subversion and git services commented on AIRFLOW-3149:
----------------------------------------------------------

Commit 578c57f1ccac0ef8b5d17b0c6d7b0fa9accff8e2 in airflow's branch refs/heads/master from
Aaron Niskode-Dossett
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=578c57f ]

[AIRFLOW-3149] Support Dataproc cluster deletion on ERROR (#4064)



> GCP dataproc cluster creation should have the option to delete an ERROR cluster
> -------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-3149
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3149
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: gcp
>    Affects Versions: 1.10.0
>            Reporter: Aaron Dossett
>            Assignee: Aaron Dossett
>            Priority: Minor
>
> We sometimes encounter issues where a Dataproc cluster creation ends up in the ERROR state.
> That is, the cluster “exists” but in the ERROR state[1] (not just that the cluster
> creation API call failed). This makes retries impossible: because the cluster name already
> exists, subsequent retried creations are guaranteed to fail.
> A `delete_cluster_on_error` parameter should be added to the `DataprocClusterCreateOperator`
> that controls whether or not an attempt is made to delete an ERROR cluster.
>
> [1] - I’ve seen that happen in two ways: 1) a purely transient error from GCP (`Internal
> server error` or the like); 2) when the request is rejected because it would exceed the
> project quota.
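
For readers of the archive, the behaviour requested above can be sketched roughly as follows. This is a minimal illustration only, not the code from the commit: the `FakeDataprocClient` class, its `get_state`/`delete`/`create` methods, the `ClusterInErrorState` exception and the `create_cluster` helper are all hypothetical names invented here. Only the `delete_cluster_on_error` flag name comes from the issue text.

```python
class ClusterInErrorState(Exception):
    """Raised when an ERROR cluster exists and deletion is disabled (hypothetical)."""


def create_cluster(client, name, delete_cluster_on_error=True):
    """Create a cluster, optionally removing a leftover ERROR cluster first.

    `client` is any object exposing get_state/delete/create (an assumption
    made for this sketch, not the real Dataproc or Airflow API).
    """
    state = client.get_state(name)  # None means no cluster with this name
    if state == "ERROR":
        if not delete_cluster_on_error:
            raise ClusterInErrorState(
                "cluster %r is in ERROR state and deletion is disabled" % name
            )
        # The name is still taken, so a retry can only succeed after deletion.
        client.delete(name)
    client.create(name)
    return client.get_state(name)


class FakeDataprocClient:
    """In-memory stand-in used only to exercise the sketch."""

    def __init__(self, clusters=None):
        self.clusters = dict(clusters or {})
        self.calls = []

    def get_state(self, name):
        return self.clusters.get(name)

    def delete(self, name):
        self.calls.append(("delete", name))
        del self.clusters[name]

    def create(self, name):
        self.calls.append(("create", name))
        if name in self.clusters:
            raise ValueError("cluster name already exists")
        self.clusters[name] = "RUNNING"
```

With the flag left at its default, a retry against a stale ERROR cluster deletes it and then creates a fresh one. With `delete_cluster_on_error=False`, the sketch raises instead of touching the cluster.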



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
