hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-9136) Profile query compiler [Spark Branch]
Date Thu, 18 Dec 2014 04:03:15 GMT

    [ https://issues.apache.org/jira/browse/HIVE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251139#comment-14251139
] 

Chao commented on HIVE-9136:
----------------------------

groupby4, auto_join0 and multigroupby_singlemr passed on my computer. I think the test failure
is caused by this:

{noformat}
2014-12-17 18:10:23,142 WARN  [main]: client.SparkClientImpl (SparkClientImpl.java:<init>(88))
- Error while waiting for client to connect.
java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out
waiting for client connection.
	at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
	at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:86)
	at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:76)
	at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:81)
	at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:53)
	at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:56)
	at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:128)
	at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:87)
	at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:116)
	at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
	at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
	at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:137)
	at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
	at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045)
....
2014-12-17 18:10:23,142 WARN  [main]: client.SparkClientImpl (SparkClientImpl.java:<init>(88))
- Error while waiting for client to connect.
java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out
waiting for client connection.
	at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
	at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:86)
	at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:76)
	at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:81)
	at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:53)
	at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:56)
	at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:128)
	at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:87)
	at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:116)
	at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
	at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
	at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:137)
	at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
	at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045)
{noformat}


> Profile query compiler [Spark Branch]
> -------------------------------------
>
>                 Key: HIVE-9136
>                 URL: https://issues.apache.org/jira/browse/HIVE-9136
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Brock Noland
>            Assignee: Chao
>         Attachments: HIVE-9136.1-spark.patch, HIVE-9136.1.patch, HIVE-9136.2-spark.patch
>
>
> We should put some performance counters around the compiler and evaluate how long it
takes to compile a query in Spark versus the other execution frameworks. Query 28 is a good
one to use for testing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message