hadoop-yarn-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1372) Ensure all completed containers are reported to the AMs across RM restart
Date Wed, 27 Aug 2014 21:03:01 GMT

    [ https://issues.apache.org/jira/browse/YARN-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112838#comment-14112838 ]

Hadoop QA commented on YARN-1372:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12664729/YARN-1372.002_RMHandlesCompletedApp.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 9 new or modified test files.

    {color:red}-1 javac{color}.  The applied patch generated 1261 javac compiler warnings (more than the trunk's current 1260 warnings).

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

                  org.apache.hadoop.yarn.server.resourcemanager.TestRMNodeTransitions
                  org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.TestRMAppAttemptTransitions
                  org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestRMNMRPCResponseId
                  org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.TestRMContainerImpl
                  org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
                  org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
                  org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager
                  org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService
                  org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA
                  org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher
                  org.apache.hadoop.yarn.server.resourcemanager.TestFifoScheduler
                  org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation
                  org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
                  org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4744//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/4744//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4744//console

This message is automatically generated.

> Ensure all completed containers are reported to the AMs across RM restart
> -------------------------------------------------------------------------
>
>                 Key: YARN-1372
>                 URL: https://issues.apache.org/jira/browse/YARN-1372
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1372.001.patch, YARN-1372.001.patch, YARN-1372.002_NMHandlesCompletedApp.patch, YARN-1372.002_RMHandlesCompletedApp.patch, YARN-1372.002_RMHandlesCompletedApp.patch, YARN-1372.prelim.patch, YARN-1372.prelim2.patch
>
>
> Currently the NM informs the RM about completed containers and then removes those containers
> from the RM notification list. The RM passes that completed container information on to the
> AM, and the AM pulls this data. If the RM dies before the AM pulls this data, the AM may
> not be able to get this information again. To fix this, the NM should maintain a separate list
> of the completed container notifications it has sent to the RM. After the AM has pulled the
> containers from the RM, the RM will inform the NM, and the NM can then remove those completed
> containers from the new list. Upon re-registering with the RM (after an RM restart), the NM
> should send the entire list of completed containers to the RM, along with any other containers
> that completed while the RM was dead. This ensures that the RM can inform the AMs about all
> completed containers. Some container completions may be reported more than once, since the AM
> may have pulled the container but the RM may die before notifying the NM about the pull.
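
The bookkeeping proposed in the description above can be sketched roughly as follows. This is only an illustration of the idea, not code from any of the attached patches: the class name CompletedContainerTracker, its method names, and the use of plain strings as container IDs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical NM-side tracker; names are assumptions, not YARN APIs. */
class CompletedContainerTracker {
    // Completions that have not yet been sent to the RM.
    private final Set<String> pending = new LinkedHashSet<>();
    // Completions already sent to the RM, kept until the RM confirms
    // that the AM has pulled them (the "separate list" in the description).
    private final Set<String> sentUnacked = new LinkedHashSet<>();

    /** Record a container that has just completed on this node. */
    void containerCompleted(String containerId) {
        if (!sentUnacked.contains(containerId)) {
            pending.add(containerId);
        }
    }

    /** Build the completed-container list for a regular heartbeat to the RM. */
    List<String> drainForHeartbeat() {
        List<String> report = new ArrayList<>(pending);
        sentUnacked.addAll(pending); // remember them instead of forgetting them
        pending.clear();
        return report;
    }

    /** The RM tells the NM which completions the AM has pulled. */
    void ackPulledByAm(Collection<String> containerIds) {
        sentUnacked.removeAll(containerIds);
    }

    /** On re-registration after an RM restart, resend everything unacknowledged. */
    List<String> reportForReregistration() {
        List<String> report = new ArrayList<>(sentUnacked);
        report.addAll(pending); // containers that completed while the RM was down
        sentUnacked.addAll(pending);
        pending.clear();
        return report;
    }
}
```

In this sketch a completion that the AM already pulled can still be resent if the RM dies before the acknowledgement reaches the NM, which matches the duplicate-report caveat in the description; the AM side would need to tolerate such duplicates.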



--
This message was sent by Atlassian JIRA
(v6.2#6252)
