hadoop-yarn-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
Date Fri, 30 Nov 2018 17:21:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705044#comment-16705044 ]

Hadoop QA commented on YARN-9071:

| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} | {color:red} YARN-9071 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9071 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12950192/YARN-9071.002.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22759/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |

This message was automatically generated.

> NM and service AM don't have updated status for reinitialized containers
> ------------------------------------------------------------------------
>                 Key: YARN-9071
>                 URL: https://issues.apache.org/jira/browse/YARN-9071
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Billie Rinaldi
>            Assignee: Chandni Singh
>            Priority: Critical
>         Attachments: YARN-9071.001.patch, YARN-9071.002.patch
> Container resource monitoring is not stopped during the reinitialization process, and
> this prevents the NM from obtaining updated process tree information when the container
> starts running again. I observed a reinitialized container go from RUNNING to
> REINITIALIZING to REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container
> monitoring was then started for a second time, but since the trackingContainers entry
> had already been initialized for the container, ContainersMonitor skipped finding the
> new PID and IP for the container. A possible solution would be to stop the container
> monitoring in the reinitialization process so that the process tree information would
> be initialized properly when monitoring is restarted. When the same container was
> stopped by the NM later, the NM did not kill the container, and the service AM received
> an unexpected event (stop at reinitializing).
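
The failure mode in the description can be illustrated with a toy model. This is a minimal sketch, not the NodeManager's ContainersMonitor code: the class MonitorSketch, its methods, and the PID values are hypothetical, and only the name trackingContainers is taken from the report. It assumes the start path keeps an existing entry, as the description says, and shows that removing the entry on stop lets the next start pick up the new PID.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the tracking behavior described in the report (hypothetical, not Hadoop code). */
class MonitorSketch {
    /** Hypothetical stand-in for the per-container process tree information. */
    static final class TrackedInfo {
        final String pid;
        TrackedInfo(String pid) { this.pid = pid; }
    }

    // Analogue of the trackingContainers map named in the report.
    private final Map<String, TrackedInfo> trackingContainers = new HashMap<>();

    /** Start monitoring. An existing entry is kept, so a new PID is not picked up. */
    void startMonitoring(String containerId, String currentPid) {
        trackingContainers.putIfAbsent(containerId, new TrackedInfo(currentPid));
    }

    /** Stop monitoring. Dropping the entry forces the next start to record the PID again. */
    void stopMonitoring(String containerId) {
        trackingContainers.remove(containerId);
    }

    /** Returns the PID currently tracked for the container, or null if it is not tracked. */
    String trackedPid(String containerId) {
        TrackedInfo info = trackingContainers.get(containerId);
        return info == null ? null : info.pid;
    }

    public static void main(String[] args) {
        // Reinitialization without stopping monitoring: the stale PID survives.
        MonitorSketch withoutStop = new MonitorSketch();
        withoutStop.startMonitoring("container_1", "100");
        withoutStop.startMonitoring("container_1", "200"); // relaunched with a new PID
        System.out.println("without stop: " + withoutStop.trackedPid("container_1")); // 100

        // Proposed approach: stop monitoring during reinitialization, then restart.
        MonitorSketch withStop = new MonitorSketch();
        withStop.startMonitoring("container_1", "100");
        withStop.stopMonitoring("container_1");
        withStop.startMonitoring("container_1", "200");
        System.out.println("with stop: " + withStop.trackedPid("container_1")); // 200
    }
}
```

In this model the second start is a no-op unless the entry was removed first, which mirrors the "possible solution" in the description. Whether the attached patches take this approach is not stated in this message.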

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
