hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM
Date Mon, 22 Apr 2013 23:29:17 GMT

    [ https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638568#comment-13638568 ]

Jian He commented on YARN-562:

bq. Can we signal an object here so that we can be notified here that we are done launching
stale containers
The sleep inside getNodeStatusAndUpdateContainersInContext() is not there for synchronization.
I already join the thread in the overridden rebootNodeStatusUpdater(). The goal here is to
simulate "while(!containers.isEmpty())" taking longer. Otherwise it is very likely that it
exits the loop very quickly (there is only 1 running container) while the test thread is still
launching a bunch of containers. We want to test that the wait loop does not run indefinitely
while someone else keeps launching new containers.
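
A minimal sketch of the scenario described above, not the actual test from the patch. It assumes the wait loop is bounded by a timeout; the class BoundedCleanupSketch, its methods, and the timing values are all hypothetical stand-ins, and only the "while(!containers.isEmpty())" shape comes from the discussion. A launcher thread keeps adding containers while the cleanup loop runs, and each loop iteration sleeps to make the loop deliberately slow.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the NodeManager side of the test; not Hadoop code.
class BoundedCleanupSketch {
    private final Set<Integer> containers = ConcurrentHashMap.newKeySet();

    void launch(int id) {
        containers.add(id);
    }

    // Returns true if the set was observed empty, false if the deadline passed first.
    boolean cleanupContainers(long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!containers.isEmpty()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of waiting forever
            }
            containers.clear(); // stand-in for killing the running containers
            sleep(pollMs);      // makes each loop iteration deliberately slow
        }
        return true;
    }

    // Runs cleanup while a launcher thread keeps adding containers.
    // Returns how long the cleanup call took, in milliseconds.
    static long runScenario(long timeoutMs) {
        BoundedCleanupSketch nm = new BoundedCleanupSketch();
        nm.launch(0); // the single container running before the resync
        AtomicBoolean stop = new AtomicBoolean(false);
        Thread launcher = new Thread(() -> {
            for (int id = 1; !stop.get(); id++) {
                nm.launch(id);
                sleep(1);
            }
        });
        launcher.start();
        long start = System.nanoTime();
        nm.cleanupContainers(timeoutMs, 10);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        stop.set(true);
        try {
            launcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return elapsedMs;
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("cleanup returned after ~" + runScenario(200) + " ms");
    }
}
```

The property being checked is only that cleanupContainers() returns in bounded time even though the container set keeps being refilled. Whether it returns true or false in the concurrent case depends on thread timing.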

bq. We could start the launcher thread in the test code after sending the RESYNC event
I don't think we can start it there, because it is possible, though unlikely, that the test
thread has finished before we have even reached the cleanupContainers method.
> NM should reject containers allocated by previous RM
> ----------------------------------------------------
>                 Key: YARN-562
>                 URL: https://issues.apache.org/jira/browse/YARN-562
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch,
YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch
> It's possible that after the RM shuts down and before the AM goes down, the AM still calls
startContainer on the NM with containers allocated by the previous RM. When the RM comes back,
the NM doesn't know whether a container launch request comes from the previous RM or the
current RM. We should reject containers allocated by the previous RM.
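
One way the rejection described above could work, sketched under assumptions: each launch request carries an identifier of the RM instance that allocated the container, and the NM compares it with the identifier of the RM it is currently registered with. The class StaleContainerCheckSketch, the LaunchRequest type, and the numeric RM identifier are hypothetical; the issue description does not say how the NM tells the two RMs apart.

```java
// Hypothetical sketch; none of these types are Hadoop classes.
class StaleContainerCheckSketch {

    // A launch request tagged with the id of the RM instance that allocated it.
    static final class LaunchRequest {
        final String containerId;
        final long allocatingRmId;

        LaunchRequest(String containerId, long allocatingRmId) {
            this.containerId = containerId;
            this.allocatingRmId = allocatingRmId;
        }
    }

    // Identifier of the RM instance the NM is currently registered with.
    private long currentRmId;

    // Called whenever the NM registers (or re-registers) with an RM instance.
    void registerWith(long rmId) {
        this.currentRmId = rmId;
    }

    // Accept only containers allocated by the RM we are registered with now.
    boolean accept(LaunchRequest request) {
        return request.allocatingRmId == currentRmId;
    }

    public static void main(String[] args) {
        StaleContainerCheckSketch nm = new StaleContainerCheckSketch();
        nm.registerWith(100L); // registered with the original RM
        LaunchRequest old = new LaunchRequest("container_1", 100L);
        System.out.println("before restart: " + nm.accept(old));
        nm.registerWith(200L); // RM restarted, NM re-registered
        System.out.println("after restart:  " + nm.accept(old));
    }
}
```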

