hadoop-mapreduce-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-4848) TaskAttemptContext cast error during AM recovery
Date Wed, 05 Dec 2012 04:02:58 GMT
Jason Lowe created MAPREDUCE-4848:
-------------------------------------

             Summary: TaskAttemptContext cast error during AM recovery
                 Key: MAPREDUCE-4848
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4848
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mr-am
    Affects Versions: 0.23.4
            Reporter: Jason Lowe


Recently saw an AM that failed and tried to recover, but the subsequent attempt quickly exited
with its own failure during recovery:

{noformat}
2012-12-05 02:33:36,752 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher:
Error in dispatcher thread
java.lang.ClassCastException: org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl cannot
be cast to org.apache.hadoop.mapred.TaskAttemptContext
	at org.apache.hadoop.mapred.OutputCommitter.recoverTask(OutputCommitter.java:284)
	at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$InterceptingEventHandler.handle(RecoveryService.java:361)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1211)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1177)
	at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:958)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:135)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:926)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:918)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
	at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.realDispatch(RecoveryService.java:285)
	at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.dispatch(RecoveryService.java:281)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
	at java.lang.Thread.run(Thread.java:619)
2012-12-05 02:33:36,752 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher:
Exiting, bbye..
{noformat}

The RM then launched a third AM attempt which succeeded. The third attempt saw basically no
progress after parsing the history file from the second attempt and ran the job again from
scratch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message