airflow-commits mailing list archives

From "Paul Woods (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRFLOW-2769) Increase num_retries polling value on Dataflow hook
Date Wed, 18 Jul 2018 21:36:00 GMT
Paul Woods created AIRFLOW-2769:
-----------------------------------

             Summary: Increase num_retries polling value on Dataflow hook
                 Key: AIRFLOW-2769
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2769
             Project: Apache Airflow
          Issue Type: Bug
          Components: contrib, Dataflow
    Affects Versions: 1.10
            Reporter: Paul Woods


*Problem Description*

When Airflow launches a job in Dataflow, it polls the GCP API for job status until the job
completes or fails.  The GCP API occasionally returns 500 and 429 errors on these API
requests, which causes the Airflow task to fail intermittently, particularly for long-running
tasks, even though the Dataflow job itself does not terminate.

The recommended action is to retry the request with exponential backoff ([https://developers.google.com/drive/api/v3/handle-errors]).
The GCP API client provides this via the `num_retries` parameter on execute(), but that
parameter is not used in
{code:java}
airflow.contrib.hooks.gcp_dataflow_hook{code}
*Proposed Solution*

Add num_retries to the execute() calls in 
{code:java}
_DataflowJob._get_job_id_from_name{code}
and
{code:java}
_DataflowJob._get_job{code}
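For illustration, a minimal sketch of the exponential-backoff behavior that passing num_retries to execute() enables in the Google API client for 429/500-class responses. The function name, signature, and status list below are hypothetical for the sketch, not actual Airflow or client-library code:

```python
import random
import time

def execute_with_retries(request_fn, num_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry request_fn with exponential backoff on transient HTTP errors.

    Hypothetical illustration of retry-on-429/5xx behavior; request_fn returns
    a (status, body) tuple here purely for the sake of the sketch.
    """
    for attempt in range(num_retries + 1):
        status, body = request_fn()
        # Return immediately on success or a non-retryable status.
        if status not in (429, 500, 502, 503):
            return status, body
        # Out of retries: surface the last failing response.
        if attempt == num_retries:
            return status, body
        # Exponential backoff with jitter: ~1s, ~2s, ~4s, ...
        sleep(base_delay * (2 ** attempt) + random.random())

# A transient failure (two 500s, then success) is absorbed by the retries:
calls = {"n": 0}
def flaky_request():
    calls["n"] += 1
    return (500, None) if calls["n"] < 3 else (200, "ok")

status, body = execute_with_retries(flaky_request, num_retries=5, sleep=lambda s: None)
```

With num_retries=0 (the current effective behavior of the hook's execute() calls), the first 500 would propagate and fail the Airflow task.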
 

*NOTE:* the same problem was addressed for Dataproc in [https://issues.apache.org/jira/browse/AIRFLOW-1718]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
