hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6168) Restarted RM may not inform AM about all existing containers
Date Mon, 27 Nov 2017 18:41:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16267224#comment-16267224
] 

Hudson commented on YARN-6168:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13279 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13279/])
YARN-6168. Restarted RM may not inform AM about all existing containers. (jianhe: rev fedabcad42067ac7dd24de40fab6be2d3485a540)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/AllocateResponsePBImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/AllocateResponse.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/DefaultAMSProcessor.java


> Restarted RM may not inform AM about all existing containers
> ------------------------------------------------------------
>
>                 Key: YARN-6168
>                 URL: https://issues.apache.org/jira/browse/YARN-6168
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Billie Rinaldi
>            Assignee: Chandni Singh
>             Fix For: 3.1.0
>
>         Attachments: YARN-6168.001.patch, YARN-6168.002.patch, YARN-6168.003.patch, YARN-6168.004.patch
>
>
> There appears to be a race condition when an RM is restarted. I had a situation where
the RMs and AM were down, but NMs and app containers were still running. When I restarted
the RM, the AM restarted, registered with the RM, and received its list of existing containers
before the NMs had reported all of their containers to the RM. The AM was only told about
some of the app's existing containers.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message