Date: Tue, 15 Aug 2017 23:26:00 +0000 (UTC)
From: "Wangda Tan (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-5764) NUMA awareness support for launching containers

    [ https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128055#comment-16128055 ]

Wangda Tan commented on YARN-5764:
----------------------------------

[~devaraj.k],

Thanks for updating the patch; I checked the latest patch implementation. Some suggestions:

1) The patch adds a NUMA controller for both DefaultContainerExecutor and LinuxContainerExecutor. Does it make sense to use this feature under DefaultContainerExecutor, since CPU requests might be ignored on the RM side (so asking for 100 vcores is the same as asking for 1 vcore)?

2) If we don't have to add support for DefaultContainerExecutor, we can probably leverage the latest ResourceHandlerModule, which would make it easier to plug in the NUMA-related logic (a rough handler sketch follows below).

3) It seems this patch doesn't handle NM restart recovery. I think we need to recover what was allocated by the NM. You could take a look at the approach in https://issues.apache.org/jira/browse/YARN-6620; some common libraries added in YARN-6620 (such as NM resource recovery) could be used to implement this feature (a recovery sketch follows below).

+ [~shanekumpf@gmail.com].
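
Point 2 refers to the NodeManager's ResourceHandler/ResourceHandlerModule plugin layer. Below is a minimal, self-contained Java sketch, not the patch's actual code, of what a NUMA handler hooked in at that layer might do: choose a NUMA node before the container process starts, emit a numactl binding prefix (numactl's standard --cpunodebind/--membind flags) for the launch command, and release the node when the container completes. All class and method names here are hypothetical illustrations rather than the real ResourceHandler interface signatures.

{code:java}
// Illustrative sketch only: NumaResourceHandlerSketch is a hypothetical class,
// not part of the YARN-5764 patch or the NM resources package.
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NumaResourceHandlerSketch {

  /** NUMA node chosen for each running container (containerId -> node id). */
  private final Map<String, Integer> assignedNode = new ConcurrentHashMap<>();
  private final int numaNodeCount;
  private int nextNode = 0;

  public NumaResourceHandlerSketch(int numaNodeCount) {
    this.numaNodeCount = numaNodeCount;
  }

  /**
   * Called before the container process is launched: pick a NUMA node and
   * return the numactl prefix that pins CPU and memory to that node.
   * Placement is naive round-robin; a real allocator would track per-node
   * memory and vcore usage before choosing.
   */
  public synchronized List<String> preStart(String containerId) {
    int node = nextNode;
    nextNode = (nextNode + 1) % numaNodeCount;
    assignedNode.put(containerId, node);
    return Arrays.asList(
        "numactl",
        "--cpunodebind=" + node,
        "--membind=" + node);
  }

  /** Called when the container finishes: release the node assignment. */
  public void postComplete(String containerId) {
    assignedNode.remove(containerId);
  }
}
{code}

In the real handler chain the binding would presumably be expressed through the executor's privileged-operation plumbing rather than returned as a bare argument list, but the pre-start/post-complete shape above is the part that the ResourceHandlerModule route would buy us over a DefaultContainerExecutor-specific hook.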
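
On point 3 (NM restart recovery): the concern is that NUMA assignments live only in NodeManager memory, so a work-preserving NM restart would lose track of which nodes are already bound to still-running containers. A rough sketch of the recovery idea follows, with an entirely hypothetical store interface standing in for whatever the NM state store would actually persist:

{code:java}
// Illustrative sketch only: NumaAssignmentStore and NumaRecoverySketch are
// hypothetical names; a real patch would persist through the NM state store.
import java.util.HashMap;
import java.util.Map;

public class NumaRecoverySketch {

  /** Hypothetical persistence layer standing in for the NM state store. */
  interface NumaAssignmentStore {
    void storeAssignment(String containerId, int numaNode);
    void removeAssignment(String containerId);
    Map<String, Integer> loadAssignments();  // everything persisted before restart
  }

  private final NumaAssignmentStore store;
  private final Map<String, Integer> assignedNode = new HashMap<>();

  public NumaRecoverySketch(NumaAssignmentStore store) {
    this.store = store;
  }

  /** Record a new assignment both in memory and in the store. */
  public void assign(String containerId, int numaNode) {
    assignedNode.put(containerId, numaNode);
    store.storeAssignment(containerId, numaNode);
  }

  /** On NM restart, rebuild in-memory state only for containers still alive. */
  public void recover(Iterable<String> liveContainerIds) {
    Map<String, Integer> persisted = store.loadAssignments();
    for (String containerId : liveContainerIds) {
      Integer node = persisted.get(containerId);
      if (node != null) {
        assignedNode.put(containerId, node);  // container survived the restart
      }
    }
  }

  /** Release a finished container's node and drop its persisted record. */
  public void release(String containerId) {
    assignedNode.remove(containerId);
    store.removeAssignment(containerId);
  }
}
{code}

The essential behavior is that assignments are written at allocation time and replayed on restart only for containers that are still alive, so nodes held by departed containers become available again, which is the same pattern the YARN-6620 resource-recovery libraries are meant to support.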

> NUMA awareness support for launching containers
> -----------------------------------------------
>
>                 Key: YARN-5764
>                 URL: https://issues.apache.org/jira/browse/YARN-5764
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, yarn
>            Reporter: Olasoji
>            Assignee: Devaraj K
>         Attachments: NUMA Awareness for YARN Containers.pdf, NUMA Performance Results.pdf, YARN-5764-v0.patch, YARN-5764-v1.patch, YARN-5764-v2.patch, YARN-5764-v3.patch
>
>
> The purpose of this feature is to improve Hadoop performance by minimizing costly remote memory accesses on non-SMP (NUMA) systems. On launch, YARN containers will be pinned to a specific NUMA node, and all subsequent memory allocations will be served by that node, reducing remote memory accesses. The current default behavior is to spread memory across all NUMA nodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)