Date: Wed, 6 Feb 2013 07:45:20 +0000 (UTC)
From: "Siddharth Seth (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-365) Each NM heartbeat should not generate and event for the Scheduler

[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572249#comment-13572249 ]

Siddharth Seth commented on YARN-365:
-------------------------------------

Xuan, I took a look at the patch. Some comments.

The scheduler should really be pulling everything available in the node being processed. Pulling only a single element doesn't change things much from how they are at the moment.
The other schedulers (FifoScheduler and FairScheduler) will also need to be updated, since the heartbeat path is common to all of them.
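
To make the pull model concrete, here is a minimal sketch, not the patch itself. It assumes the node buffers the container updates from each heartbeat and only asks for a scheduler event when nothing is already pending; the scheduler then takes the whole backlog in one call. Every class and method name below (HeartbeatDrainSketch, Node, UpdatedContainerInfo, onHeartbeat, pullContainerUpdates) is a hypothetical stand-in, not the actual YARN API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these types are the real YARN classes.
class HeartbeatDrainSketch {

    /** Container state changes reported by one heartbeat (assumed shape). */
    static final class UpdatedContainerInfo {
        final List<String> launched;
        final List<String> completed;

        UpdatedContainerInfo(List<String> launched, List<String> completed) {
            this.launched = launched;
            this.completed = completed;
        }
    }

    /** Stand-in for the node: buffers heartbeat updates until the scheduler pulls. */
    static final class Node {
        private List<UpdatedContainerInfo> pending = new ArrayList<>();

        /**
         * Records one heartbeat. Returns true only if nothing was pending,
         * i.e. only when a new scheduler event actually needs to be raised.
         */
        synchronized boolean onHeartbeat(UpdatedContainerInfo info) {
            boolean needsEvent = pending.isEmpty();
            pending.add(info);
            return needsEvent;
        }

        /** Hands the scheduler everything buffered so far and resets the buffer. */
        synchronized List<UpdatedContainerInfo> pullContainerUpdates() {
            List<UpdatedContainerInfo> drained = pending;
            pending = new ArrayList<>();
            return drained;
        }
    }

    public static void main(String[] args) {
        Node node = new Node();
        int events = 0;
        // Three heartbeats arrive before the scheduler gets around to the node.
        for (int i = 0; i < 3; i++) {
            UpdatedContainerInfo info = new UpdatedContainerInfo(
                Collections.singletonList("container_" + i),
                Collections.<String>emptyList());
            if (node.onHeartbeat(info)) {
                events++;
            }
        }
        List<UpdatedContainerInfo> batch = node.pullContainerUpdates();
        // prints: 1 event(s), 3 update(s) pulled
        System.out.println(events + " event(s), " + batch.size() + " update(s) pulled");
    }
}
```

Returning a plain List from the pull call keeps the buffer private to the node, and swapping in a fresh list under the lock means the scheduler never iterates over a structure the heartbeat path is still appending to.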
Also, some thought needs to be given to handling cases where the node may have gone unhealthy, etc.

Digging into the patch,
- Don't think RMNode should expose its internal data structure via {{getNodeUpdateQueue}}. Instead, it should expose a method that gives back a List of ContainerUpdates.
- Do we need an explicit setNextHeartBeat? Instead, the call to get container updates could be used for now.
- NodeUpdateSchedulerEvent should be changed to remove the container information, instead of sending nulls.
- Similarly for nodeUpdate in the CapacityScheduler.
- Rename UpdateContainerInfo to UpdatedContainerInfo.

The code does have some formatting issues - please take a look at http://wiki.apache.org/hadoop/HowToContribute for code formatting guidelines and other useful info.

Also, could you please upload another doc with the latest approach, to stay in sync with the patch.

Thanks!

> Each NM heartbeat should not generate and event for the Scheduler
> -----------------------------------------------------------------
>
>                 Key: YARN-365
>                 URL: https://issues.apache.org/jira/browse/YARN-365
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager, scheduler
>    Affects Versions: 0.23.5
>            Reporter: Siddharth Seth
>            Assignee: Xuan Gong
>         Attachments: Prototype2.txt, YARN-365.1.patch, YARN-365.2.patch
>
>
> Follow up from YARN-275
> https://issues.apache.org/jira/secure/attachment/12567075/Prototype.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira