hive-dev mailing list archives

From "chengxiang li" <chengxiang...@intel.com>
Subject Re: Review Request 30055: HIVE-9337 : Move more hive.spark.* configurations to HiveConf
Date Tue, 20 Jan 2015 03:33:55 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30055/#review68686
-----------------------------------------------------------


Szehon, did you try to transfer the RSC configuration through the RemoteDriver --conf option? The command
generated in SparkClientImpl should look like:
SparkSubmit --properties-file /tmp/spark-submit.1267525585014474423.properties --class org.apache.hive.spark.client.RemoteDriver
/usr/lib/hive-0.15.0/lib/hive-exec-0.15.0-SNAPSHOT.jar --remote-host node14-4 --remote-port
38136 --conf hive.spark....=... --conf hive.spark...=...
It's quite strange that SparkSubmit handles its child main class's arguments.
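
As a rough illustration of what I mean (this is not the actual RemoteDriver code), the child main class could pick up its own "--conf key=value" pairs from the arguments that follow the jar. The class name DriverArgsSketch, the method parseConf, and the key hive.spark.example.key are made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: how a child main class might read "--conf key=value" arguments.
public class DriverArgsSketch {

    // Collects "--conf key=value" pairs; every other argument is appended to 'rest' unchanged.
    static Map<String, String> parseConf(String[] args, List<String> rest) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            if ("--conf".equals(args[i]) && i + 1 < args.length) {
                String pair = args[++i];
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException("Expected key=value, got: " + pair);
                }
                // Split on the first '=' only, so values may themselves contain '='.
                conf.put(pair.substring(0, eq), pair.substring(eq + 1));
            } else {
                rest.add(args[i]);
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        // The host and port come from the command above; the conf key is an assumption.
        String[] sample = {
            "--remote-host", "node14-4", "--remote-port", "38136",
            "--conf", "hive.spark.example.key=value"
        };
        List<String> rest = new ArrayList<>();
        Map<String, String> conf = parseConf(sample, rest);
        System.out.println("conf = " + conf);
        System.out.println("other args = " + rest);
    }
}
```

This only shows the parsing on the driver side; whether SparkSubmit forwards such arguments untouched is exactly the open question.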

- chengxiang li


On Jan. 19, 2015, 11:16 p.m., Szehon Ho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30055/
> -----------------------------------------------------------
> 
> (Updated Jan. 19, 2015, 11:16 p.m.)
> 
> 
> Review request for hive and chengxiang li.
> 
> 
> Bugs: HIVE-9337
>     https://issues.apache.org/jira/browse/HIVE-9337
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change allows the Remote Spark Driver's properties to be set dynamically via Hive
> configuration (i.e., set commands).
> 
> Went through the Remote Spark Driver's properties and added them to HiveConf, fixing
> the descriptions so that they're clearer in a global context with other Hive properties.
> Also fixed a bug in the description that stated the default value of max message size is 10MB;
> it should read 50MB. One open question is that I did not move 'hive.spark.log.dir', as I could
> not find where it was read and did not know if it's still being used somewhere.
> 
> The passing of these properties between the client (Hive) and RemoteSparkDriver is done via
> the properties file. One note is that these properties have to be prefixed with 'spark.',
> as SparkConf only accepts those. I tried for a long time to pass them via 'conf' but found that
> it won't work (see SparkSubmitArguments.scala). It may be possible to pass each of them as another
> argument (like --hive.spark.XXX=YYY), but I think it's more scalable to do it via the properties
> file.
> 
> On the Remote Spark Driver side, I kept the defensive logic to provide a default value
> in case the conf object doesn't contain the property. This may occur if a prop is unset.
> For this, I had to instantiate a HiveConf in that process to get the default value, as some
> of the timeout props need a HiveConf instance to do calculations on.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 068c962
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 334c191
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 865e03e
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java ac71ae9
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java 5a826ba
>   spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java a2dd3e6
> 
> Diff: https://reviews.apache.org/r/30055/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>
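
For reference, the properties-file approach described in the quoted request might look roughly like the sketch below. It is only an illustration under stated assumptions, not the code in the patch: the class PrefixedPropsSketch, its methods, the key hive.spark.example.timeout, and the default value are all hypothetical. It writes each Hive-side key with a 'spark.' prefix into a temporary properties file, then reads a key back on the receiving side with a fallback default in case the property was never set.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch of passing prefixed properties through a file.
public class PrefixedPropsSketch {

    // Prefix assumed here so that the keys look like Spark properties.
    static final String PREFIX = "spark.";

    // Client side: write every property under PREFIX + key into a temp file.
    static Path writePrefixed(Map<String, String> hiveProps) throws IOException {
        Properties out = new Properties();
        for (Map.Entry<String, String> e : hiveProps.entrySet()) {
            out.setProperty(PREFIX + e.getKey(), e.getValue());
        }
        Path file = Files.createTempFile("spark-submit.", ".properties");
        try (Writer w = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            out.store(w, "illustrative only");
        }
        return file;
    }

    // Driver side: look the key up under its prefixed name, falling back to a default.
    static String readOrDefault(Path file, String hiveKey, String defaultValue) throws IOException {
        Properties in = new Properties();
        try (Reader r = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            in.load(r);
        }
        return in.getProperty(PREFIX + hiveKey, defaultValue);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hive.spark.example.timeout", "1000ms"); // made-up key and value
        Path file = writePrefixed(props);
        System.out.println(readOrDefault(file, "hive.spark.example.timeout", "500ms"));
        System.out.println(readOrDefault(file, "hive.spark.example.unset", "fallback"));
        Files.deleteIfExists(file);
    }
}
```

In the real patch the defaults come from a HiveConf instance rather than a literal string; the literal here just keeps the sketch self-contained.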

