hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rui Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15912) Executor kill task and Failed to get spark memory/core info
Date Mon, 13 Mar 2017 09:00:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15907035#comment-15907035
] 

Rui Li commented on HIVE-15912:
-------------------------------

[~yiyao], is that all the redirector log you have? Usually in yarn-client mode, the redirector
should output quite a lot stuff because all the RemoteDriver log goes to the redirector. You
should find something like creating SparkContext, starting all the schedulers in spark, etc.
Even for the specific log you posted, it should have come with an exception: https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java#L678.
Btw, if you're using an old version of HoS, I'd suggest you try the case in a newer one.

> Executor kill task and Failed to get spark memory/core info
> -----------------------------------------------------------
>
>                 Key: HIVE-15912
>                 URL: https://issues.apache.org/jira/browse/HIVE-15912
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Spark
>    Affects Versions: 2.2.0
>         Environment: hadoop2.7.1
> spark2.0.2
> Hive2.2
>            Reporter: KaiXu
>
> Hive on Spark, failed with error:
> Starting Spark Job = 12a8cb8c-ed0d-4049-ae06-8d32d13fe285
> Failed to monitor Job[ 6] with exception 'java.lang.IllegalStateException(RPC channel
is closed.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> Hive's log:
> 2017-02-14T19:03:09,147  INFO [stderr-redir-1] client.SparkClientImpl: 17/02/14 19:03:09
INFO yarn.Client: Application report for application_1486905599813_0403 (state: ACCEPTED)
> 2017-02-14T19:03:10,817  WARN [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] spark.SetSparkReducerParallelism:
Failed to get spark memory/core info
> java.util.concurrent.TimeoutException
>         at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
>         at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:155)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:165)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getMemoryAndCores(SparkSessionImpl.java:77)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:119)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.runJoinOptimizations(SparkCompiler.java:291)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:120)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11085)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:279)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:510) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1302) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1442) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1222) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:430) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:446) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:749) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:715) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:642) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_60]
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
~[?:1.8.0_60]
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:1.8.0_60]
>         at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_60]
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221) ~[hadoop-common-2.7.1.jar:?]
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136) ~[hadoop-common-2.7.1.jar:?]
> 2017-02-14T19:03:10,817  INFO [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] spark.SetSparkReducerParallelism:
Set parallelism for reduce sink RS[24] to: 1 (calculated)
> yarn log:
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 329.0 in stage 15.0 (TID 3865).
3228 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 201.0 in stage 15.0 (TID 3737).
3141 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 9.0 in stage 15.0 (TID 3545).
4027 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 457.0 in stage 15.0 (TID 3993).
3141 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4221
> 17/02/14 19:05:36 INFO executor.Executor: Running task 687.0 in stage 15.0 (TID 4221)
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 265.0 in stage 15.0 (TID 3801).
3141 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4240
> 17/02/14 19:05:36 INFO executor.Executor: Running task 692.0 in stage 15.0 (TID 4240)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4242
> 17/02/14 19:05:36 INFO executor.Executor: Running task 763.0 in stage 15.0 (TID 4242)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4243
> 17/02/14 19:05:36 INFO executor.Executor: Running task 836.0 in stage 15.0 (TID 4243)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4244
> 17/02/14 19:05:36 INFO executor.Executor: Running task 863.0 in stage 15.0 (TID 4244)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4248
> 17/02/14 19:05:36 INFO executor.Executor: Running task 979.0 in stage 15.0 (TID 4248)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4280
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_687 locally
> 17/02/14 19:05:36 INFO executor.Executor: Running task 990.0 in stage 15.0 (TID 4280)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 687.0 in stage
15.0 (TID 4221)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 73.0 in stage
15.0 (TID 3609)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 763.0 in stage
15.0 (TID 4242)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 979.0 in stage
15.0 (TID 4248)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 990.0 in stage
15.0 (TID 4280)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 863.0 in stage
15.0 (TID 4244)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 836.0 in stage
15.0 (TID 4243)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 692.0 in stage
15.0 (TID 4240)
> 17/02/14 19:05:36 INFO storage.BlockManager: Removing RDD 24
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 73.0 in stage 15.0 (TID 3609).
3214 bytes result sent to driver
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_863 locally
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_692 locally
> 17/02/14 19:05:36 ERROR executor.Executor: Exception in task 990.0 in stage 15.0 (TID
4280)
> org.apache.spark.TaskKilledException
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:264)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 687.0 in stage 15.0 (TID
4221)
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_836 locally
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_979 locally
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_763 locally
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 863.0 in stage 15.0 (TID
4244)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 979.0 in stage 15.0 (TID
4248)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 836.0 in stage 15.0 (TID
4243)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 692.0 in stage 15.0 (TID
4240)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 763.0 in stage 15.0 (TID
4242)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Driver from 192.168.1.1:35981
disconnected during shutdown
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Driver from 192.168.1.1:35981
disconnected during shutdown
> 17/02/14 19:05:36 INFO memory.MemoryStore: MemoryStore cleared
> 17/02/14 19:05:36 INFO storage.BlockManager: BlockManager stopped
> 17/02/14 19:05:36 INFO util.ShutdownHookManager: Shutdown hook called



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message