hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1812) Job stays in PREP state for long time after RM Restarts
Date Tue, 11 Mar 2014 01:35:43 GMT

    [ https://issues.apache.org/jira/browse/YARN-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13929847#comment-13929847 ]

Jian He commented on YARN-1812:
-------------------------------

RMAppAttempt can receive an unexpected ContainerFinished event in the NEW state while an NM
is resyncing. The attempt is recovered asynchronously after all services, including
ResourceTrackerService, have started, so RMAppAttempt can receive the ContainerFinished
event before the Recover event. The event is then rejected as invalid at NEW, but RMApp is
still waiting for this ContainerFinished event, which indicates that the previous attempt has
finished, so that it can start a new attempt.
{code}
2014-02-20 16:01:56,539 INFO  resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(141)) - USER=hrt_qa  OPERATION=Application Finished - Succeeded  TARGET=RMAppManager RESULT=SUCCESS APPID=application_1392911035357_0001
2014-02-20 16:01:56,540 ERROR attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(647)) - Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: CONTAINER_FINISHED at NEW
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:645)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:102)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:736)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:717)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:722)
{code}
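
The ordering problem can be illustrated with a toy model. The sketch below is a hypothetical, heavily simplified state machine written only for this illustration; the class, state, and method names (AttemptRaceSketch, RECOVERED, replay, and so on) are assumptions and do not correspond to the real RMAppAttemptImpl transitions. It replays the same two events in both orders to show why an early CONTAINER_FINISHED is lost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the race; NOT the real RMAppAttemptImpl state machine.
class AttemptRaceSketch {
    enum State { NEW, RECOVERED, FINISHED }
    enum Event { RECOVER, CONTAINER_FINISHED }

    static class Attempt {
        State state = State.NEW;
        // Events with no transition from the current state are recorded and dropped,
        // mirroring the "Invalid event: CONTAINER_FINISHED at NEW" error in the log.
        final List<String> rejected = new ArrayList<>();

        void handle(Event event) {
            if (state == State.NEW && event == Event.RECOVER) {
                state = State.RECOVERED;
            } else if (state == State.RECOVERED && event == Event.CONTAINER_FINISHED) {
                state = State.FINISHED;
            } else {
                rejected.add("Invalid event: " + event + " at " + state);
            }
        }
    }

    // Feeds the events to a fresh attempt in the given order.
    static Attempt replay(Event... events) {
        Attempt attempt = new Attempt();
        for (Event event : events) {
            attempt.handle(event);
        }
        return attempt;
    }

    public static void main(String[] args) {
        // Order the attempt expects: recover first, then learn the container finished.
        Attempt inOrder = replay(Event.RECOVER, Event.CONTAINER_FINISHED);
        System.out.println("in order: state=" + inOrder.state + " rejected=" + inOrder.rejected);

        // Order seen during NM resync: the finish event arrives before recovery.
        Attempt raced = replay(Event.CONTAINER_FINISHED, Event.RECOVER);
        System.out.println("raced:    state=" + raced.state + " rejected=" + raced.rejected);
    }
}
```

In the raced order the toy attempt drops the finish event and never reaches FINISHED, which is analogous to RMApp waiting indefinitely for the previous attempt to finish.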

> Job stays in PREP state for long time after RM Restarts
> -------------------------------------------------------
>
>                 Key: YARN-1812
>                 URL: https://issues.apache.org/jira/browse/YARN-1812
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yesha Vora
>            Assignee: Jian He
>
> Steps followed:
> 1) Start a sort job with 80 maps and 5 reducers.
> 2) Restart the ResourceManager when 60 maps and 0 reducers have finished.
> 3) Wait for the job to come out of PREP state.
> The job does not come out of PREP state within 7-8 minutes, at which point the test kills it.
> A sort job should not take this long to come out of PREP state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
