hive-issues mailing list archives

From "Marcelo Vanzin (JIRA)" <>
Subject [jira] [Commented] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit
Date Wed, 19 Apr 2017 22:52:05 GMT


Marcelo Vanzin commented on HIVE-16484:

{{SparkLauncher}} is currently just a wrapper around spark-submit. It adds some nice APIs
on top of it, but it still requires a SPARK_HOME and everything else. Once SPARK-11035 is
fixed, you can avoid the dependency on spark-submit (the shell script), but you'd still
need most of the other things you already need (like a proper configuration).

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> --------------------------------------------------------------------
>                 Key: HIVE-16484
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
> The {{SparkClientImpl#startDriver}} method currently looks for the {{SPARK_HOME}} directory
> and invokes the {{bin/spark-submit}} script, which spawns a separate process to run the Spark
> application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS --> lower startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}}, which provides some
> useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners
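
For readers unfamiliar with the approach the issue wants to replace, here is a minimal sketch of what "look for {{SPARK_HOME}} and invoke {{bin/spark-submit}}" roughly amounts to. It is not the actual {{SparkClientImpl}} code: the class name {{SparkSubmitCommandSketch}}, both method names, the driver class, and the jar path are hypothetical and exist only for illustration. Only the {{--master}}, {{--class}}, and {{--conf}} options are real spark-submit options.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from Hive's SparkClientImpl.
public class SparkSubmitCommandSketch {

    /** Resolves bin/spark-submit under SPARK_HOME, failing fast when it is unset. */
    public static String sparkSubmitPath(String sparkHome) {
        if (sparkHome == null || sparkHome.isEmpty()) {
            throw new IllegalStateException("SPARK_HOME is not set");
        }
        return sparkHome + File.separator + "bin" + File.separator + "spark-submit";
    }

    /** Builds the argument vector that a separate spark-submit process would receive. */
    public static List<String> buildCommand(String sparkHome, String master, String mainClass,
                                            Map<String, String> conf, String appJar) {
        List<String> argv = new ArrayList<>();
        argv.add(sparkSubmitPath(sparkHome));
        argv.add("--master");
        argv.add(master);
        argv.add("--class");
        argv.add(mainClass);
        // Each Spark property becomes its own "--conf key=value" pair.
        for (Map.Entry<String, String> e : conf.entrySet()) {
            argv.add("--conf");
            argv.add(e.getKey() + "=" + e.getValue());
        }
        argv.add(appJar);
        return argv;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.executor.memory", "2g");
        // The fallback path is an assumption for the demo, not a Hive default.
        String sparkHome = System.getenv().getOrDefault("SPARK_HOME", "/opt/spark");
        List<String> argv = buildCommand(sparkHome, "yarn",
                "org.example.HypotheticalDriver", conf, "/tmp/hypothetical-driver.jar");
        // A real client would pass argv to new ProcessBuilder(argv).start().
        System.out.println(String.join(" ", argv));
    }
}
```

As the comment above points out, {{SparkLauncher}} currently wraps this same script, so the {{SPARK_HOME}} check and the configuration would still be required after a switch; what changes is mainly the API used to start and observe the application.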

This message was sent by Atlassian JIRA
