hive-dev mailing list archives

From "Chengxiang Li (JIRA)" <>
Subject [jira] [Commented] (HIVE-7436) Load Spark configuration into Hive driver
Date Mon, 21 Jul 2014 06:32:38 GMT


Chengxiang Li commented on HIVE-7436:

[~xuefuz] Thanks for the comments. For the first question, default master/appname values should be added in case spark-defaults.conf is missing; I'll update the patch later.
Second question: would the user be able to set or change the Spark configuration via Hive's set command? I guess not, but I'd like to hear your thoughts.
Here are some thoughts about this:
# Spark configuration is set at the application level, which means the user cannot reset Spark configuration dynamically while a Spark application is running. (A Spark application's lifecycle is roughly the same as the lifecycle of its SparkContext instance; see the sketch after this list.)
# Changing Spark configuration via the Hive set command would mean that Spark jobs representing different Hive query commands must be submitted as different Spark applications.
# Currently the Hive driver runs all queries in the same Spark application (singleton SparkClient => singleton SparkContext).
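To illustrate point 1 with a minimal sketch (not code from the patch): the SparkConf is captured when the SparkContext is constructed, so a later set has no effect on the running application.
{code:java}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ConfIsApplicationLevel {
  public static void main(String[] args) {
    // Configuration must be decided before the context (the application) exists.
    SparkConf conf = new SparkConf()
        .setMaster("local")
        .setAppName("Hive on Spark");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // Mutating the conf now does not reconfigure the running application;
    // the new value would only apply to a future SparkContext.
    conf.set("spark.executor.memory", "4g");

    sc.stop();
  }
}
{code}
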
So this question mostly depends on another one: should the Hive driver submit all queries in a singleton Spark application, or create a separate Spark application for each query?
# For a singleton Spark application: little submission cost, but cluster resources are fixed for the whole Hive driver lifecycle.
# For a separate Spark application per query: more submission cost (config loading, dependency transfer, cluster resource allocation), but resources are allocated dynamically for each query.

Shark uses a singleton Spark application, so it is not resource efficient, as it cannot dynamically adjust assigned resources as required. What do you think about this?
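For reference, here is a hypothetical sketch of the singleton option; the class and method names are illustrative, not taken from the patch. One SparkClient holds one SparkContext, so every query shares the same application and its fixed resources.
{code:java}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public final class SparkClient {
  private static volatile SparkClient instance;
  private final JavaSparkContext sc;

  private SparkClient(SparkConf conf) {
    // Resources are negotiated once here and stay fixed for the driver lifetime.
    this.sc = new JavaSparkContext(conf);
  }

  public static SparkClient getInstance(SparkConf conf) {
    // Double-checked locking: the application submission cost is paid exactly once.
    if (instance == null) {
      synchronized (SparkClient.class) {
        if (instance == null) {
          instance = new SparkClient(conf);
        }
      }
    }
    return instance;
  }

  public JavaSparkContext getContext() {
    return sc;
  }
}
{code}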

> Load Spark configuration into Hive driver
> -----------------------------------------
>                 Key: HIVE-7436
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>         Attachments: HIVE-7436-Spark.1.patch
> Load Spark configuration into the Hive driver. There are 3 ways to set up Spark configuration:
> #  Configure properties in the Spark configuration file (spark-defaults.conf).
> #  Java property.
> #  System environment.
> Spark supports configuration through the system environment only for compatibility with earlier scripts; we won't support it in Hive on Spark. Hive on Spark loads defaults from Java properties, then loads properties from the configuration file, overriding existing properties.
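> A rough sketch of that loading order (illustrative only, not the actual patch code): seed the properties from spark.* Java system properties, then let spark-defaults.conf override anything already set.
> {code:java}
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.io.InputStream;
> import java.util.Properties;
>
> public class SparkConfLoader {
>   public static Properties loadSparkProperties(String confFile) throws IOException {
>     Properties props = new Properties();
>
>     // 1. Defaults from Java system properties (e.g. -Dspark.master=...).
>     for (String name : System.getProperties().stringPropertyNames()) {
>       if (name.startsWith("spark.")) {
>         props.setProperty(name, System.getProperty(name));
>       }
>     }
>
>     // 2. spark-defaults.conf overrides existing properties.
>     //    Properties.load accepts the file's whitespace-separated "key value" pairs.
>     try (InputStream in = new FileInputStream(confFile)) {
>       Properties fileProps = new Properties();
>       fileProps.load(in);
>       for (String name : fileProps.stringPropertyNames()) {
>         props.setProperty(name, fileProps.getProperty(name).trim());
>       }
>     }
>     return props;
>   }
> }
> {code}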
> Configuration steps:
> # Create spark-defaults.conf, and place it in the /etc/spark/conf configuration directory.
>     Please refer to [] for the configuration of spark-defaults.conf; a small example is shown after these steps.
> # Create the $SPARK_CONF_DIR environment variable and set it to the location of spark-defaults.conf.
>     export SPARK_CONF_DIR=/etc/spark/conf
> # Add $SPARK_CONF_DIR to the $HADOOP_CLASSPATH environment variable.
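> For example, after these steps (values and paths illustrative only; this is also where the default master/appname discussed above would come from):
> {code}
> # /etc/spark/conf/spark-defaults.conf (step 1)
> spark.master             local
> spark.app.name           Hive on Spark
>
> # shell environment (steps 2 and 3)
> export SPARK_CONF_DIR=/etc/spark/conf
> export HADOOP_CLASSPATH=$SPARK_CONF_DIR:$HADOOP_CLASSPATH
> {code}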
> NO PRECOMMIT TESTS. This is for spark-branch only.

This message was sent by Atlassian JIRA
