hive-dev mailing list archives

From "Sandy Ryza (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-9370) Enable Hive on Spark for BigBench and run Query 8, the test failed [Spark Branch]
Date Fri, 16 Jan 2015 06:08:35 GMT

    [ https://issues.apache.org/jira/browse/HIVE-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279844#comment-14279844 ]

Sandy Ryza commented on HIVE-9370:
----------------------------------

That's correct: sortByKey is meant to launch a probe job to determine the bounds of the partitions. This is similar to how the MR TeraSort implementation works.
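
To make that concrete, here is a minimal standalone sketch (a hypothetical driver for illustration, not the Hive-on-Spark code path in the trace below; the class and app names are made up) showing that sortByKey is not fully lazy: building its RangePartitioner runs a sampling job over the input before the sort itself ever executes.

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class SortByKeyProbe {
      public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("sortByKey-probe")
            .setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaPairRDD<Integer, String> pairs = sc.parallelizePairs(Arrays.asList(
            new Tuple2<>(3, "c"), new Tuple2<>(1, "a"), new Tuple2<>(2, "b")));

        // Not fully lazy: sortByKey constructs a RangePartitioner here, and the
        // partitioner's constructor runs a sampling ("probe") job over the input
        // (RangePartitioner.sketch -> RDD.collect) to pick partition boundaries.
        JavaPairRDD<Integer, String> sorted = pairs.sortByKey(true);

        // The sort/shuffle itself runs later, as a separate second job.
        System.out.println(sorted.collect());
        sc.stop();
      }
    }

In the trace below, that sampling collect (RangePartitioner$.sketch at Partitioner.scala:262) appears to be the job that gets interrupted when the Hive-side monitor aborts after 30s, before the real sort job is ever submitted.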

> Enable Hive on Spark for BigBench and run Query 8, the test failed [Spark Branch]
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-9370
>                 URL: https://issues.apache.org/jira/browse/HIVE-9370
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: yuyun.chen
>
> Enabled Hive on Spark and ran BigBench Query 8, then got the following exception:
> 2015-01-14 11:43:46,057 INFO  [main]: impl.RemoteSparkJobStatus (RemoteSparkJobStatus.java:getSparkJobInfo(143)) - Job hasn't been submitted after 30s. Aborting it.
> 2015-01-14 11:43:46,061 INFO  [main]: impl.RemoteSparkJobStatus (RemoteSparkJobStatus.java:getSparkJobInfo(143)) - Job hasn't been submitted after 30s. Aborting it.
> 2015-01-14 11:43:46,061 ERROR [main]: status.SparkJobMonitor (SessionState.java:printError(839)) - Status: Failed
> 2015-01-14 11:43:46,062 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=SparkRunJob start=1421206996052 end=1421207026062 duration=30010 from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) - 15/01/14 11:43:46 INFO RemoteDriver: Failed to run job 0a9a7782-0e0b-4561-8468-959a6d8df0a3
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) - java.lang.InterruptedException
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at java.lang.Object.wait(Native Method)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at java.lang.Object.wait(Object.java:503)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:514)
> 2015-01-14 11:43:46,071 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1282)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1300)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1314)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1328)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.rdd.RDD.collect(RDD.scala:780)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.RangePartitioner$.sketch(Partitioner.scala:262)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.RangePartitioner.<init>(Partitioner.scala:124)
> 2015-01-14 11:43:46,072 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.rdd.OrderedRDDFunctions.sortByKey(OrderedRDDFunctions.scala:63)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:894)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.spark.api.java.JavaPairRDD.sortByKey(JavaPairRDD.scala:864)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.hadoop.hive.ql.exec.spark.SortByShuffler.shuffle(SortByShuffler.java:48)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.hadoop.hive.ql.exec.spark.ShuffleTran.transform(ShuffleTran.java:45)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.hadoop.hive.ql.exec.spark.SparkPlan.generateGraph(SparkPlan.java:69)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:223)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:298)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:269)
> 2015-01-14 11:43:46,073 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 2015-01-14 11:43:46,074 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 2015-01-14 11:43:46,074 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 2015-01-14 11:43:46,074 INFO  [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(436)) -        at java.lang.Thread.run(Thread.java:745)
> 2015-01-14 11:43:46,077 WARN  [RPC-Handler-3]: client.SparkClientImpl (SparkClientImpl.java:handle(407)) - Received result for unknown job 0a9a7782-0e0b-4561-8468-959a6d8df0a3
> 2015-01-14 11:43:46,091 ERROR [main]: ql.Driver (SessionState.java:printError(839)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.spark.SparkTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
