hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
Date Mon, 14 Aug 2017 21:07:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126396#comment-16126396 ]

Jason Lowe commented on YARN-2402:

bq. Container recovery for Windows has been fully verified on Windows.

Excellent news!  Curious, how does container recovery on Windows reconstruct the exit
code for containers that completed while the NM was down?  We should really mark this JIRA
as a duplicate of whatever JIRA added that functionality, but I didn't see any code that
handles this on Windows.  The prototype patch attached to this JIRA did something along those
lines, and I don't see how Windows can properly recover exit codes for completed containers
without something like it.

bq. there is also not unit test for getting exit code from the exitCodeFile for Unix or getting
pid from the pidFile for Windows, seems it is trivial to test this simple script.

Feel free to file a JIRA for adding those tests, and I can help review it.
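
For context on the exit-code-file idea discussed above, here is a minimal sketch of
recovering an exit code from a file that a launch script left behind. It is not the
NodeManager's actual code: the function name read_exit_code, the file layout (a single
integer as text), and the use of Python are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def read_exit_code(exit_code_file: Path) -> Optional[int]:
    """Return the exit code stored in exit_code_file, or None if unavailable.

    Hypothetical helper: assumes the file holds one integer written as text.
    """
    try:
        text = exit_code_file.read_text().strip()
    except FileNotFoundError:
        # No file: the container may still be running, or nothing was recorded.
        return None
    if not text:
        # File exists but is empty: the write may have been interrupted.
        return None
    try:
        return int(text)
    except ValueError:
        # Contents are not an integer, so no exit code can be recovered.
        return None
```

A helper of this shape is also easy to unit test: write a known value to a temporary
file and check the parsed result, then repeat for the missing, empty, and malformed cases.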

> NM restart: Container recovery for Windows
> ------------------------------------------
>                 Key: YARN-2402
>                 URL: https://issues.apache.org/jira/browse/YARN-2402
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Yuqi Wang
>         Attachments: YARN-2402-v1.patch, YARN-2402-v2.patch
> We should add container recovery for NM restart on Windows.
