airflow-dev mailing list archives

From Felipe Elias Lolas Isuani <felipe.elias...@gmail.com>
Subject [DISCUSS][WIP][AIRFLOW-3863] Make SparkSubmitHook capable of executing spark-submit through SSH Connection
Date Tue, 19 Feb 2019 02:29:44 GMT
Hi!

I'm currently working on adding SSH super-powers to SparkSubmitOperator. It's really simple:
it uses SSHHook and a small wrapper around the SSH connection to mimic the Popen interface.
We are using it internally at our company, because we have several secured Spark clusters
with different software versions, and it would be really difficult to manage either an
Airflow worker installation in every cluster or a spark-submit binary installed on the
Airflow worker. I think this is a common problem.
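
To make the idea concrete, here is a rough, untested sketch (not my actual WIP code, which
is linked below) of how a paramiko channel obtained from SSHHook could be wrapped to mimic
the parts of the subprocess.Popen interface that SparkSubmitHook consumes (a file-like
stdout, wait(), and the exit code). The SSHPopenWrapper name and the 'spark_cluster_ssh'
connection id are just placeholders:

    from airflow.contrib.hooks.ssh_hook import SSHHook


    class SSHPopenWrapper(object):
        """Expose a remote command through a minimal Popen-like API."""

        def __init__(self, ssh_conn_id, command):
            hook = SSHHook(ssh_conn_id=ssh_conn_id)
            client = hook.get_conn()                  # paramiko.SSHClient
            self._channel = client.get_transport().open_session()
            self._channel.set_combine_stderr(True)    # merge stderr into stdout
            self._channel.exec_command(command)
            # File-like object so callers can iterate over output lines
            self.stdout = self._channel.makefile('rb')
            self.returncode = None

        def wait(self):
            # Blocks until the remote command finishes, like Popen.wait()
            self.returncode = self._channel.recv_exit_status()
            return self.returncode

        def kill(self):
            # Closing the channel terminates the remote process
            self._channel.close()


    # Stream the driver log the way SparkSubmitHook iterates over Popen.stdout,
    # then pick up the exit code.
    proc = SSHPopenWrapper('spark_cluster_ssh', 'spark-submit --version')
    for line in iter(proc.stdout.readline, b''):
        print(line.rstrip())
    print('exit code:', proc.wait())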

I want to know whether anyone else wants this kind of feature; if so, I can continue the
work with tests and documentation and open a PR. I would also like to hear ideas, concerns,
etc. about this approach. I will be happy to hear feedback from the Airflow community.

The WIP code is available at https://github.com/flolas/airflow/blob/5bc837a03d226718f78eecbf4c637de222280adc/airflow/contrib/hooks/spark_submit_hook.py



Cheers,

Felipe L.