Date: Fri, 1 Dec 2017 09:03:01 +0000 (UTC)
From: "Tao Yang (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-5139) [Umbrella] Move YARN scheduler towards global scheduler

    [ https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274131#comment-16274131 ]

Tao Yang commented on YARN-5139:
--------------------------------

Async-scheduling decouples scheduling from NM heartbeats and uses multiple allocation threads to make scheduling more efficient. It is very important for us: we have been running the async-scheduling mode of the global scheduler for half a year on our production cluster, which has thousands of nodes. All the problems we met were reported to the community, and most of them have already been resolved. Glad to hear that this feature is moving forward. We will pay more attention to multiple-node lookup (YARN-7494) and the node scorer (in design) for better placement, and we would like to participate in this work if needed.

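To make the idea of "multiple allocation threads decoupled from heartbeats" concrete, here is a minimal toy sketch in Python. It is not YARN code: the class `ToyScheduler`, the methods `_try_commit` and `_allocate_loop`, and the "slots per node" model are all hypothetical names and simplifications chosen for illustration. The assumption it illustrates is that allocation threads may pick a node from a possibly stale view of the cluster, so the final bookkeeping step re-checks capacity under a lock.

```python
import threading
from queue import Queue, Empty


class ToyScheduler:
    """Hypothetical toy model: each node has a number of free container slots."""

    def __init__(self, free_slots):
        self._free = dict(free_slots)   # node name -> free slots
        self._lock = threading.Lock()   # guards _free and allocations
        self.allocations = []           # list of (request, node) pairs

    def _try_commit(self, request, node):
        # Re-validate under the lock: the caller's view may be stale
        # because other allocation threads run concurrently.
        with self._lock:
            if self._free.get(node, 0) > 0:
                self._free[node] -= 1
                self.allocations.append((request, node))
                return True
            return False

    def _allocate_loop(self, pending):
        # Runs in its own thread; no node heartbeat is needed to trigger it.
        while True:
            try:
                request = pending.get_nowait()
            except Empty:
                return
            # Unlocked read of free slots: good enough to rank candidates,
            # correctness is enforced in _try_commit.
            for node in sorted(self._free, key=lambda n: -self._free[n]):
                if self._try_commit(request, node):
                    break

    def run(self, requests, num_threads=4):
        pending = Queue()
        for request in requests:
            pending.put(request)
        threads = [
            threading.Thread(target=self._allocate_loop, args=(pending,))
            for _ in range(num_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return self.allocations
```

For example, `ToyScheduler({"n1": 2, "n2": 1}).run(["r0", "r1", "r2", "r3", "r4"], num_threads=3)` places three of the five requests, whichever threads win the race, and never oversubscribes a node. Requests that find no free slot are simply dropped in this sketch; a real scheduler would keep them pending.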
> [Umbrella] Move YARN scheduler towards global scheduler
> -------------------------------------------------------
>
>                 Key: YARN-5139
>                 URL: https://issues.apache.org/jira/browse/YARN-5139
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: Explanantions of Global Scheduling (YARN-5139) Implementation.pdf, YARN-5139-Concurrent-scheduling-performance-report.pdf, YARN-5139-Global-Schedulingd-esign-and-implementation-notes-v2.pdf, YARN-5139-Global-Schedulingd-esign-and-implementation-notes.pdf, YARN-5139.000.patch, wip-1.YARN-5139.patch, wip-2.YARN-5139.patch, wip-3.YARN-5139.patch, wip-4.YARN-5139.patch, wip-5.YARN-5139.patch
>
>
> The existing YARN scheduler is based on node heartbeats. This can lead to sub-optimal decisions because the scheduler can only look at one node at a time when scheduling resources.
> Pseudo code of the existing scheduling logic looks like:
> {code}
> for node in allNodes:
>    Go to parentQueue
>       Go to leafQueue
>          for application in leafQueue.applications:
>             for resource-request in application.resource-requests:
>                try to schedule on node
> {code}
> Considering future complex resource placement requirements, such as node constraints (give me "a && b || c") or anti-affinity (do not allocate HBase RegionServers and Storm workers on the same host), we may need to consider moving the YARN scheduler towards global scheduling.
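
The difference between the heartbeat-driven loop quoted above and a global view can be sketched with a small, self-contained Python example. Everything here is an illustrative assumption rather than YARN's implementation: the function names, the representation of a cluster as a mapping from node name to the set of tags running on it, and the "least-loaded valid node" tie-break are all hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical cluster state: node name -> tags of containers running there.
Cluster = Dict[str, Set[str]]
# Hypothetical anti-affinity rules: tag -> tags it must not share a host with.
# The map is directional here; a real rule set would usually be symmetric.
Conflicts = Dict[str, Set[str]]


def violates_anti_affinity(running: Set[str], tag: str, conflicts: Conflicts) -> bool:
    """True if placing `tag` next to `running` breaks an anti-affinity rule."""
    return bool(running & conflicts.get(tag, set()))


def schedule_on_heartbeat(node: str, cluster: Cluster, tag: str,
                          conflicts: Conflicts) -> Optional[str]:
    """Heartbeat-style decision: only the heartbeating node is visible."""
    if violates_anti_affinity(cluster[node], tag, conflicts):
        return None  # nothing to do until some other node heartbeats
    cluster[node].add(tag)
    return node


def schedule_globally(cluster: Cluster, tag: str,
                      conflicts: Conflicts) -> Optional[str]:
    """Global-style decision: look at all nodes, pick the least-loaded valid one."""
    candidates = [
        name for name, running in cluster.items()
        if not violates_anti_affinity(running, tag, conflicts)
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda name: (len(cluster[name]), name))
    cluster[best].add(tag)
    return best


cluster = {"n1": {"hbase-regionserver"}, "n2": set(), "n3": {"other"}}
conflicts = {"storm-worker": {"hbase-regionserver"}}

# A heartbeat from n1 cannot place the Storm worker there.
print(schedule_on_heartbeat("n1", cluster, "storm-worker", conflicts))
# With a view of all nodes, the scheduler can choose a valid host directly.
print(schedule_globally(cluster, "storm-worker", conflicts))
```

The heartbeat-style function can only accept or reject the node that happened to report in, while the global function can compare candidates, which is what constraints such as anti-affinity or node scoring need.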