hadoop-yarn-issues mailing list archives

From "Omkar Vinit Joshi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-733) TestNMClient fails occasionally
Date Fri, 31 May 2013 18:24:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671712#comment-13671712 ]

Omkar Vinit Joshi commented on YARN-733:
----------------------------------------

[~zjshen] small nit:
bq. may still need some time to make the container actually started or stopped because of its asynchronous
Suggested wording: may still need some time to either start or stop the container because of its asynchronous

I hope we are not calling getContainerStatus after the application has finished. In that case
we won't have tokens on the NM side for authentication.
                
> TestNMClient fails occasionally
> -------------------------------
>
>                 Key: YARN-733
>                 URL: https://issues.apache.org/jira/browse/YARN-733
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: YARN-733.1.patch
>
>
> The problem happens at:
> {code}
>         // getContainerStatus can be called after stopContainer
>         try {
>           ContainerStatus status = nmClient.getContainerStatus(
>               container.getId(), container.getNodeId(),
>               container.getContainerToken());
>           assertEquals(container.getId(), status.getContainerId());
>           assertEquals(ContainerState.RUNNING, status.getState());
>           assertTrue("" + i, status.getDiagnostics().contains(
>               "Container killed by the ApplicationMaster."));
>           assertEquals(-1000, status.getExitStatus());
>         } catch (YarnRemoteException e) {
>           fail("Exception is not expected");
>         }
> {code}
> NMClientImpl#stopContainer returns before the container has actually stopped, because
> ContainerManagerImpl implements stopContainer asynchronously. The container's status is
> therefore in transition, and calling NMClientImpl#getContainerStatus immediately after
> stopContainer may return either the RUNNING status or the COMPLETE one.
> A similar problem exists for NMClientImpl#startContainer.
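
One way to make a test tolerate this kind of asynchronous transition is to poll the status until it reaches an acceptable state or a timeout expires, rather than asserting on the first response. The following is only a minimal sketch of that idea, not the change in YARN-733.1.patch: StatusPoller and waitFor are hypothetical names that do not exist in Hadoop, and the status source is modeled as a plain Supplier so the example runs without a cluster.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper (not a Hadoop class): polls a status source until it is accepted. */
final class StatusPoller {
  private StatusPoller() {}

  /**
   * Calls query repeatedly until accepted returns true for the result,
   * sleeping intervalMs between attempts. Throws IllegalStateException
   * if no accepted status is seen within roughly timeoutMs.
   */
  static <T> T waitFor(Supplier<T> query, Predicate<T> accepted,
                       long timeoutMs, long intervalMs) throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      T status = query.get();
      if (accepted.test(status)) {
        return status;
      }
      // Subtraction keeps the comparison safe if nanoTime wraps around.
      if (System.nanoTime() - deadline >= 0) {
        throw new IllegalStateException("Timed out; last status was " + status);
      }
      Thread.sleep(intervalMs);
    }
  }
}
```

In the test above, the query would presumably wrap the nmClient.getContainerStatus call and the predicate would check for the state the test is waiting on. The timeout and interval values are assumptions that would need tuning for the test environment.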

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira