hive-issues mailing list archives

From "Yi Yao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15912) Executor kill task and Failed to get spark memory/core info
Date Mon, 13 Mar 2017 01:20:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906767#comment-15906767 ]

Yi Yao commented on HIVE-15912:
-------------------------------

Hi [~lirui], I checked the earlier log; there is only one timeout warning. I have set the Spark network timeout threshold to 1 hour, and I did not find any network issues in my OS log. Not sure what causes the issue.
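For reference, the timeout knobs involved in this failure can be raised per-session before running the query. This is only a sketch; the values below are illustrative, and the property names assume a Hive 2.x / Spark 2.x setup as in the reported environment:

```sql
-- Spark-side RPC/network timeout (the "1 hour" threshold mentioned above)
set spark.network.timeout=3600s;
-- Hive-side timeout for futures returned by the remote Spark client;
-- RemoteHiveSparkClient/RemoteSparkJobStatus calls time out against this
set hive.spark.client.future.timeout=120s;
-- How long Hive's job monitor waits for job state before giving up
set hive.spark.job.monitor.timeout=120s;
```

Note that raising these only widens the window; if the driver actually kills the tasks (as in the yarn log below), the underlying cause is elsewhere.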

2017-03-09 17:23:29,761 WARN [main]: impl.RemoteSparkJobStatus (RemoteSparkJobStatus.java:getSparkStageInfo(162)) - Error getting stage info
java.util.concurrent.TimeoutException
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkStageInfo(RemoteSparkJobStatus.java:160)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkStageProgress(RemoteSparkJobStatus.java:96)

> Executor kill task and Failed to get spark memory/core info
> -----------------------------------------------------------
>
>                 Key: HIVE-15912
>                 URL: https://issues.apache.org/jira/browse/HIVE-15912
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Spark
>    Affects Versions: 2.2.0
>         Environment: hadoop2.7.1
> spark2.0.2
> Hive2.2
>            Reporter: KaiXu
>
> Hive on Spark, failed with error:
> Starting Spark Job = 12a8cb8c-ed0d-4049-ae06-8d32d13fe285
> Failed to monitor Job[ 6] with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> Hive's log:
> 2017-02-14T19:03:09,147  INFO [stderr-redir-1] client.SparkClientImpl: 17/02/14 19:03:09 INFO yarn.Client: Application report for application_1486905599813_0403 (state: ACCEPTED)
> 2017-02-14T19:03:10,817  WARN [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] spark.SetSparkReducerParallelism: Failed to get spark memory/core info
> java.util.concurrent.TimeoutException
>         at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
>         at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:155) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:165) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getMemoryAndCores(SparkSessionImpl.java:77) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:119) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.runJoinOptimizations(SparkCompiler.java:291) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:120) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11085) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:279) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:510) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1302) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1442) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1222) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:430) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:446) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:749) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:715) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:642) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_60]
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_60]
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_60]
>         at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_60]
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221) ~[hadoop-common-2.7.1.jar:?]
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136) ~[hadoop-common-2.7.1.jar:?]
> 2017-02-14T19:03:10,817  INFO [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] spark.SetSparkReducerParallelism: Set parallelism for reduce sink RS[24] to: 1 (calculated)
> yarn log:
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 329.0 in stage 15.0 (TID 3865). 3228 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 201.0 in stage 15.0 (TID 3737). 3141 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 9.0 in stage 15.0 (TID 3545). 4027 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 457.0 in stage 15.0 (TID 3993). 3141 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4221
> 17/02/14 19:05:36 INFO executor.Executor: Running task 687.0 in stage 15.0 (TID 4221)
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 265.0 in stage 15.0 (TID 3801). 3141 bytes result sent to driver
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4240
> 17/02/14 19:05:36 INFO executor.Executor: Running task 692.0 in stage 15.0 (TID 4240)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4242
> 17/02/14 19:05:36 INFO executor.Executor: Running task 763.0 in stage 15.0 (TID 4242)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4243
> 17/02/14 19:05:36 INFO executor.Executor: Running task 836.0 in stage 15.0 (TID 4243)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4244
> 17/02/14 19:05:36 INFO executor.Executor: Running task 863.0 in stage 15.0 (TID 4244)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4248
> 17/02/14 19:05:36 INFO executor.Executor: Running task 979.0 in stage 15.0 (TID 4248)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 4280
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_687 locally
> 17/02/14 19:05:36 INFO executor.Executor: Running task 990.0 in stage 15.0 (TID 4280)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 687.0 in stage 15.0 (TID 4221)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 73.0 in stage 15.0 (TID 3609)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 763.0 in stage 15.0 (TID 4242)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 979.0 in stage 15.0 (TID 4248)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 990.0 in stage 15.0 (TID 4280)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 863.0 in stage 15.0 (TID 4244)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 836.0 in stage 15.0 (TID 4243)
> 17/02/14 19:05:36 INFO executor.Executor: Executor is trying to kill task 692.0 in stage 15.0 (TID 4240)
> 17/02/14 19:05:36 INFO storage.BlockManager: Removing RDD 24
> 17/02/14 19:05:36 INFO executor.Executor: Finished task 73.0 in stage 15.0 (TID 3609). 3214 bytes result sent to driver
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_863 locally
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_692 locally
> 17/02/14 19:05:36 ERROR executor.Executor: Exception in task 990.0 in stage 15.0 (TID 4280)
> org.apache.spark.TaskKilledException
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:264)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 687.0 in stage 15.0 (TID 4221)
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_836 locally
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_979 locally
> 17/02/14 19:05:36 INFO storage.BlockManager: Found block rdd_24_763 locally
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 863.0 in stage 15.0 (TID 4244)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 979.0 in stage 15.0 (TID 4248)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 836.0 in stage 15.0 (TID 4243)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 692.0 in stage 15.0 (TID 4240)
> 17/02/14 19:05:36 INFO executor.Executor: Executor killed task 763.0 in stage 15.0 (TID 4242)
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Driver from 192.168.1.1:35981 disconnected during shutdown
> 17/02/14 19:05:36 INFO executor.CoarseGrainedExecutorBackend: Driver from 192.168.1.1:35981 disconnected during shutdown
> 17/02/14 19:05:36 INFO memory.MemoryStore: MemoryStore cleared
> 17/02/14 19:05:36 INFO storage.BlockManager: BlockManager stopped
> 17/02/14 19:05:36 INFO util.ShutdownHookManager: Shutdown hook called



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
