Date: Tue, 25 Feb 2014 13:50:34 +0000 (UTC)
From: "Hudson (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1686) NodeManager.resyncWithRM() does not handle exception which cause NodeManger to Hang.

    [ https://issues.apache.org/jira/browse/YARN-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911578#comment-13911578 ]

Hudson commented on YARN-1686:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1684 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1684/])
YARN-1686. Fixed NodeManager to properly handle any errors during re-registration after a RESYNC and thus avoid hanging. Contributed by Rohith Sharma.
(vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1571474)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java


> NodeManager.resyncWithRM() does not handle exception which cause NodeManger to Hang.
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-1686
>                 URL: https://issues.apache.org/jira/browse/YARN-1686
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.3.0
>            Reporter: Rohith
>            Assignee: Rohith
>             Fix For: 2.4.0
>
>         Attachments: YARN-1686.1.patch, YARN-1686.2.patch, YARN-1686.3.patch
>
>
> During start of the NodeManager, if registration with the ResourceManager throws an exception, the NodeManager shuts down.
> Consider the case where NM-1 is registered with the RM and the RM issues a RESYNC to the NM. If any exception is thrown in "resyncWithRM" (which starts a new thread that does not handle exceptions) during the RESYNC event, that thread is lost and the NodeManager enters a hung state.
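
For readers unfamiliar with this failure mode, here is a minimal Java sketch of the general pattern the issue describes. It is not the committed patch. The names ResyncSketch, Registration, ShutdownHook and resync are hypothetical and are not taken from NodeManager.java. The sketch assumes the fix amounts to catching the failure inside the worker thread and reporting it, so that the process can shut down instead of silently losing the thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ResyncSketch {

    /** Hypothetical stand-in for the re-registration step that may fail. */
    interface Registration {
        void reRegister() throws Exception;
    }

    /** Hypothetical stand-in for whatever asks the node to shut down. */
    interface ShutdownHook {
        void requestShutdown(Throwable cause);
    }

    /**
     * Runs re-registration on a new thread. Without the try/catch, an
     * exception would end the thread and nobody would be told about it.
     * Here the failure is handed to the shutdown hook instead.
     */
    static Thread resync(Registration registration, ShutdownHook hook) {
        Thread worker = new Thread(() -> {
            try {
                registration.reRegister();
            } catch (Exception e) {
                // Report the failure rather than letting the thread die quietly.
                hook.requestShutdown(e);
            }
        }, "resync-sketch");
        worker.start();
        return worker;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean shutdownRequested = new AtomicBoolean(false);

        // Simulate a registration failure during resync.
        Thread worker = resync(
            () -> { throw new IllegalStateException("simulated registration failure"); },
            cause -> shutdownRequested.set(true));

        worker.join();
        System.out.println("shutdown requested: " + shutdownRequested.get());
    }
}
```

Passing the hook in as a parameter is only a convenience for the sketch: it keeps the example self-contained and makes the failure path easy to exercise in a test.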