Date: Fri, 8 Jan 2016 08:52:39 +0000 (UTC)
From: "Yuqi Wang (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-2402) NM restart: Container recovery for Windows

     [ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuqi Wang updated YARN-2402:
----------------------------
    Attachment: YARN-2402-v2.patch

Corrected coding style.

The issue is that a recovered container always exits with a failed status: the NM cannot find the container's exitCodeFile, so it cannot obtain the actual exit code. I have tested the patch by getting the exit code from a container that was recovered and then failed, and from one that was recovered and then succeeded.
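
For illustration only, here is a minimal sketch of the failure mode described above: reading an exit code from a file and distinguishing "file not found" from a real exit code. It is not taken from the patch or from the NodeManager source; the class name ExitCodeFileSketch, the method readExitCode, and the file name container.exitcode are hypothetical, and the file is assumed to contain a single integer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.OptionalInt;

// Hypothetical sketch, not the NodeManager implementation.
public class ExitCodeFileSketch {

    /**
     * Reads an integer exit code from the given file.
     * Returns an empty OptionalInt when the file is missing, blank, or not
     * a number, so the caller can tell "unknown" apart from a real code
     * instead of treating every such case as a failure.
     */
    public static OptionalInt readExitCode(Path exitCodeFile) throws IOException {
        String content;
        try {
            content = new String(Files.readAllBytes(exitCodeFile),
                                 StandardCharsets.UTF_8).trim();
        } catch (NoSuchFileException e) {
            // The situation described in the comment: the file cannot be found.
            return OptionalInt.empty();
        }
        if (content.isEmpty()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(content));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("exitcode-sketch");

        // A file that exists and holds the exit code of a successful run.
        Path present = dir.resolve("container.exitcode");
        Files.write(present, "0\n".getBytes(StandardCharsets.UTF_8));

        // A path that was never written, as when the lookup goes wrong.
        Path missing = dir.resolve("missing.exitcode");

        System.out.println("present: " + readExitCode(present));
        System.out.println("missing: " + readExitCode(missing));
    }
}
```

If the caller maps an empty result to a failed status, a container that actually succeeded is reported as failed whenever the file lookup fails, which matches the symptom reported here.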

I have also checked that there is no unit test for getting the exit code from the exitCodeFile on Unix, or for getting the pid from the pidFile on Windows; perhaps this simple script is too trivial to need a test. If a unit test is needed, I can add one afterwards. :)

> NM restart: Container recovery for Windows
> ------------------------------------------
>
>                 Key: YARN-2402
>                 URL: https://issues.apache.org/jira/browse/YARN-2402
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>         Attachments: YARN-2402-v1.patch, YARN-2402-v2.patch
>
>
> We should add container recovery for NM restart on Windows.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)