hive-user mailing list archives

From Jone Zhang <joyoungzh...@gmail.com>
Subject Hive on Spark task running time is too long
Date Mon, 11 Jan 2016 07:21:47 GMT
*I have submitted an application many times.*
*Most of the runs complete correctly. See attachment 1.*
*But one of them did not finish as expected. See attachments 2.1 and 2.2.*

*Why does a task with such a small data size run for so long? I can't find
any helpful information in the YARN logs.*

*Part of the log is as follows:*
16/01/11 12:45:19 INFO storage.BlockManagerMasterEndpoint: Trying to remove
executor 1 from BlockManagerMaster.
16/01/11 12:45:19 INFO storage.BlockManagerMasterEndpoint: Removing block
manager BlockManagerId(1, 10.226.148.160, 44366)
16/01/11 12:45:19 INFO storage.BlockManagerMaster: Removed 1 successfully
in removeExecutor
16/01/11 12:50:32 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0
on 10.219.58.123:39594 in memory (size: 92.2 KB, free: 441.4 MB)
16/01/11 12:55:20 WARN spark.HeartbeatReceiver: Removing executor 2 with no
recent heartbeats: 604535 ms exceeds timeout 600000 ms
16/01/11 12:55:20 ERROR cluster.YarnClusterScheduler: Lost an executor 2
(already removed): Executor heartbeat timed out after 604535 ms
16/01/11 12:55:20 WARN spark.HeartbeatReceiver: Removing executor 1 with no
recent heartbeats: 609228 ms exceeds timeout 600000 ms
16/01/11 12:55:20 ERROR cluster.YarnClusterScheduler: Lost an executor 1
(already removed): Executor heartbeat timed out after 609228 ms
16/01/11 12:55:20 WARN spark.HeartbeatReceiver: Removing executor 4 with no
recent heartbeats: 615098 ms exceeds timeout 600000 ms
16/01/11 12:55:20 ERROR cluster.YarnClusterScheduler: Lost an executor 4
(already removed): Executor heartbeat timed out after 615098 ms
16/01/11 12:55:20 WARN spark.HeartbeatReceiver: Removing executor 3 with no
recent heartbeats: 616730 ms exceeds timeout 600000 ms
16/01/11 12:55:20 INFO cluster.YarnClusterSchedulerBackend: Requesting to
kill executor(s) 2
16/01/11 12:55:20 ERROR cluster.YarnClusterScheduler: Lost an executor 3
(already removed): Executor heartbeat timed out after 616730 ms
16/01/11 12:55:20 WARN cluster.YarnClusterSchedulerBackend: Executor to
kill 2 does not exist!
16/01/11 12:55:20 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested
to kill executor(s) .
16/01/11 12:55:20 INFO cluster.YarnClusterSchedulerBackend: Requesting to
kill executor(s) 1
16/01/11 12:55:20 WARN cluster.YarnClusterSchedulerBackend: Executor to
kill 1 does not exist!
16/01/11 12:55:20 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested
to kill executor(s) .
16/01/11 12:55:20 INFO cluster.YarnClusterSchedulerBackend: Requesting to
kill executor(s) 4
16/01/11 12:55:20 WARN cluster.YarnClusterSchedulerBackend: Executor to
kill 4 does not exist!
16/01/11 12:55:20 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested
to kill executor(s) .
16/01/11 12:55:20 INFO cluster.YarnClusterSchedulerBackend: Requesting to
kill executor(s) 3
16/01/11 12:55:20 WARN cluster.YarnClusterSchedulerBackend: Executor to
kill 3 does not exist!
16/01/11 12:55:20 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested
to kill executor(s) .
16/01/11 14:29:55 WARN client.RemoteDriver: Shutting down driver because
RPC channel was closed.
16/01/11 14:29:55 INFO client.RemoteDriver: Shutting down remote driver.
16/01/11 14:29:55 INFO scheduler.DAGScheduler: Asked to cancel job 1
16/01/11 14:29:55 INFO client.RemoteDriver: Failed to run job
2fbbb881-988b-4454-ad9e-a20783aaf38e
java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:371)
        at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
16/01/11 14:29:55 INFO cluster.YarnClusterScheduler: Cancelling stage 2
16/01/11 14:29:55 INFO cluster.YarnClusterScheduler: Removed TaskSet 2.0,
whose tasks have all completed, from pool
16/01/11 14:29:55 INFO cluster.YarnClusterScheduler: Stage 2 was cancelled
16/01/11 14:29:55 INFO scheduler.DAGScheduler: ShuffleMapStage 2
(mapPartitionsToPair at MapTran.java:31) failed in 6278.824 s
16/01/11 14:29:55 INFO handler.ContextHandler: stopped
o.s.j.s.ServletContextHandler{/metrics/json,null}
16/01/11 14:29:55 INFO handler.ContextHandler: stopped
o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/01/11 14:29:55 INFO handler.ContextHandler: stopped
o.s.j.s.ServletContextHandler{/api,null}
16/01/11 14:29:55 INFO handler.ContextHandler: stopped
o.s.j.s.ServletContextHandler{/,null}
16/01/11 14:29:55 INFO handler.ContextHandler: stopped
o.s.j.s.ServletContextHandler{/static,null}
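
To make the timeline in this excerpt easier to read, here is a minimal Python sketch that pulls out the heartbeat timeouts and measures how long the driver stayed up after the last executor was reported lost. It assumes the wrapped log lines have been rejoined into single lines. The `summarize` function and the regular expressions are hypothetical helpers written for this excerpt, not part of Hive or Spark.

```python
import re
from datetime import datetime

# Sample lines copied from the log excerpt, with wrapped lines rejoined.
LOG = """\
16/01/11 12:55:20 WARN spark.HeartbeatReceiver: Removing executor 2 with no recent heartbeats: 604535 ms exceeds timeout 600000 ms
16/01/11 12:55:20 WARN spark.HeartbeatReceiver: Removing executor 1 with no recent heartbeats: 609228 ms exceeds timeout 600000 ms
16/01/11 12:55:20 WARN spark.HeartbeatReceiver: Removing executor 4 with no recent heartbeats: 615098 ms exceeds timeout 600000 ms
16/01/11 12:55:20 WARN spark.HeartbeatReceiver: Removing executor 3 with no recent heartbeats: 616730 ms exceeds timeout 600000 ms
16/01/11 14:29:55 WARN client.RemoteDriver: Shutting down driver because RPC channel was closed.
"""

# Timestamp layout used in this log: two-digit year/month/day, then time.
TS_FORMAT = "%y/%m/%d %H:%M:%S"

# Hypothetical patterns matched against the message text seen in the excerpt.
HEARTBEAT = re.compile(
    r"^(\S+ \S+) WARN spark\.HeartbeatReceiver: Removing executor (\d+) "
    r"with no recent heartbeats: (\d+) ms exceeds timeout (\d+) ms"
)
SHUTDOWN = re.compile(r"^(\S+ \S+) WARN client\.RemoteDriver: Shutting down driver")


def summarize(log_text):
    """Return ({executor_id: ms over the timeout}, seconds from last loss to shutdown)."""
    over_timeout = {}
    last_loss = None
    shutdown = None
    for line in log_text.splitlines():
        m = HEARTBEAT.match(line)
        if m:
            last_loss = datetime.strptime(m.group(1), TS_FORMAT)
            over_timeout[int(m.group(2))] = int(m.group(3)) - int(m.group(4))
            continue
        m = SHUTDOWN.match(line)
        if m:
            shutdown = datetime.strptime(m.group(1), TS_FORMAT)
    idle = None
    if last_loss is not None and shutdown is not None:
        idle = (shutdown - last_loss).total_seconds()
    return over_timeout, idle


lost, idle_seconds = summarize(LOG)
print(lost)
print(idle_seconds)
```

On the lines above, this reports all four executors as past the 600000 ms timeout at 12:55:20 and a gap of 5675 seconds before the driver shut down at 14:29:55, which accounts for most of the 6278.824 s reported for ShuffleMapStage 2.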


*Best wishes.*
*Thanks.*
