hive-issues mailing list archives

From "Xuefu Zhang (JIRA)" <>
Subject [jira] [Commented] (HIVE-9970) Hive on spark
Date Wed, 25 Nov 2015 23:22:11 GMT


Xuefu Zhang commented on HIVE-9970:

[~tarushg], the error seems different from the original issue, and it is very strange:
Caused by: Cannot run program "/home/adt/server/spark1.5/spark-1.5.1set
mapreduce.input.fileinputformat.split.maxsize=750000000set hive.vectorized.execution.enabled=trueset
hive.cbo.enable=trueset hive.optimize.reducededuplication.min.reducer=4set hive.optimize.reducededuplication=trueset
hive.orc.splits.include.file.footer=falseset hive.merge.mapfiles=trueset hive.merge.sparkfiles=falseset
hive.merge.smallfiles.avgsize=16000000set hive.merge.size.per.task=256000000set hive.merge.orcfile.stripe.level=trueset
hive.optimize.sort.dynamic.partition=falseset hive.stats.autogather=trueset hive.stats.fetch.column.stats=trueset
hive.vectorized.execution.reduce.enabled=falseset hive.vectorized.groupby.checkinterval=4096set
hive.vectorized.groupby.flush.percent=0.1set hive.compute.query.using.stats=trueset hive.limit.pushdown.memory.usage=0.4set
hive.optimize.index.filter=trueset hive.exec.reducers.bytes.per.reducer=67108864set hive.smbjoin.cache.rows=10000set
hive.exec.orc.default.stripe.size=67108864set hive.fetch.task.conversion=moreset hive.fetch.task.conversion.threshold=1073741824set
hive.fetch.task.aggr=falseset mapreduce.input.fileinputformat.list-status.num-threads=5set
error=36, File name too long
This is where Hive builds the process that launches the remote Spark driver. The command usually
starts with something like "/home/adt/server/spark1.5/bin/spark-submit ...". It seems that the builder
gets corrupted with a bunch of set commands. Could you describe how to reproduce this issue?
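
A minimal sketch of the check described above, not Hive's actual launcher code: it inspects the first element of a launch command and reports why the operating system would refuse to run it. The function name diagnose_launch_command and the 255-byte per-component limit (typical on Linux filesystems, where error 36 is ENAMETOOLONG) are assumptions made for illustration.

```python
import os

# Assumed per-component file name limit; typical on Linux filesystems.
NAME_MAX = 255


def diagnose_launch_command(argv):
    """Return a list of problems found in the program path argv[0]."""
    if not argv:
        return ["empty command"]
    problems = []
    program = argv[0]
    # A single path component longer than NAME_MAX makes exec fail with
    # ENAMETOOLONG ("File name too long").
    longest = max(len(part.encode("utf-8")) for part in program.split("/"))
    if longest > NAME_MAX:
        problems.append(
            "path component of %d bytes exceeds %d" % (longest, NAME_MAX)
        )
    # The launcher is expected to be .../bin/spark-submit.
    if os.path.basename(program) != "spark-submit":
        problems.append("program does not end in spark-submit")
    # Hive "set key=value" text should never appear in the program path.
    if "set " in program:
        problems.append("program path contains 'set ' commands")
    return problems


if __name__ == "__main__":
    good = ["/home/adt/server/spark1.5/bin/spark-submit", "--verbose"]
    bad = ["/home/adt/server/spark1.5/spark-1.5.1" + "set hive.cbo.enable=true" * 20]
    print(diagnose_launch_command(good))
    print(diagnose_launch_command(bad))
```

Running a check like this against the command logged just before the failure would show whether the settings text ended up in the program path itself rather than in the arguments.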

> Hive on spark
> -------------
>                 Key: HIVE-9970
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Amithsha
>            Assignee: Tarush Grover
> Hi all,
> I recently configured Spark 1.2.0; my environment is Hadoop
> 2.6.0 and Hive 1.1.0. I tried Hive on Spark, and while executing
> an INSERT INTO I get the following error.
> Query ID = hadoop2_20150313162828_8764adad-a8e4-49da-9ef5-35e4ebd6bc63
> Total jobs = 1
> Launching Job 1 out of 1
> In order to change the average load for a reducer (in bytes):
> set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
> set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
> set mapreduce.job.reduces=<number>
> Failed to execute spark task, with exception
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create
> spark client.)'
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> I have added the spark-assembly jar to the Hive lib directory,
> and also in the Hive console using the add jar command, followed by these steps:
> set spark.home=/opt/spark-1.2.1/;
> add jar /opt/spark-1.2.1/assembly/target/scala-2.10/spark-assembly-1.2.1-hadoop2.4.0.jar;
> set hive.execution.engine=spark;
> set spark.master=spark://xxxxxxx:7077;
> set spark.eventLog.enabled=true;
> set spark.executor.memory=512m;
> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
> Can anyone suggest a fix?
> Thanks & Regards
> Amithsha

This message was sent by Atlassian JIRA