airflow-commits mailing list archives

From "Milan van der Meer (JIRA)" <>
Subject [jira] [Work started] (AIRFLOW-1854) Improve Spark submit hook for cluster mode
Date Tue, 28 Nov 2017 10:12:00 GMT


Work on AIRFLOW-1854 started by Milan van der Meer.
> Improve Spark submit hook for cluster mode
> ------------------------------------------
>                 Key: AIRFLOW-1854
>                 URL:
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: hooks
>            Reporter: Milan van der Meer
>            Assignee: Milan van der Meer
>            Priority: Minor
>              Labels: features
> *We are already working on this issue and will submit a PR soon*
> When executing a Spark submit to a standalone cluster through the Spark submit hook, the hook receives the return code of the spark-submit action itself, not of the Spark job.
> As a result, once the submission is successfully accepted by the cluster, the Airflow task is marked successful, even if the Spark job later fails on the cluster.
> Suggested solution:
> * When a Spark submit is executed, the response contains a driver ID.
> * Use this driver ID to poll the cluster for the driver state.
> * Mark the Airflow task successful or failed based on the driver's final state.
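The polling logic suggested above could be sketched roughly as follows. This is a hypothetical illustration, not the actual PR: the helper names (`extract_driver_id`, `is_terminal`, `job_succeeded`) and the set of terminal states are assumptions based on Spark's standalone-mode driver states; a real implementation would poll the master's REST API (e.g. `http://<master>:6066/v1/submissions/status/<driver_id>`) until a terminal state is reached.

```python
import re
from typing import Optional

# Assumption: standalone-mode driver IDs look like "driver-<timestamp>-<seq>"
# and appear in spark-submit's output after submission is accepted.
DRIVER_ID_RE = re.compile(r"driver-\d+-\d+")

# Assumption: these are the driver states after which polling can stop.
TERMINAL_STATES = {"FINISHED", "ERROR", "FAILED", "KILLED"}


def extract_driver_id(submit_output: str) -> Optional[str]:
    """Pull the driver ID out of spark-submit's output, if present."""
    match = DRIVER_ID_RE.search(submit_output)
    return match.group(0) if match else None


def is_terminal(driver_state: str) -> bool:
    """True once the driver has reached a final state and polling can stop."""
    return driver_state in TERMINAL_STATES


def job_succeeded(driver_state: str) -> bool:
    """Map the final driver state onto Airflow task success or failure."""
    return driver_state == "FINISHED"
```

The hook would then loop: poll the driver state for the extracted ID, sleep between polls, and once `is_terminal` returns true, raise an exception (failing the Airflow task) unless `job_succeeded` is true.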

This message was sent by Atlassian JIRA
