Date: Thu, 2 Feb 2017 11:13:55 +0000 (UTC)
From: "Sunil G (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-5889) Improve user-limit calculation in capacity scheduler

     [ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-5889:
--------------------------
    Attachment: YARN-5889.0009.patch

Thanks [~leftnoteasy] for reviewing the patch thoroughly. I am uploading a new patch.

bq.4) isRecomputeNeeded:
I am slightly confused here; I think we do need the null check. Let me explain in detail. Assume that no user-limit has been precomputed when the RM has just started or the queue has just been refreshed. All caches will then be empty, and the first computation happens when a container request comes in. In that case userLimitPerSchedulingMode will be null, so we do a recompute, after which userLimitPerSchedulingMode will have some entries. So a null check is needed for this very first scenario. I can look into whether this check can be done outside or not. Am I missing something here? Please share your view.

bq.And also, we don't need latestVersionOfUserCount, instead we should call latestVersionOfUsersState.get().
userLimitNeedsRecompute and getLatestVersionOfUsersState are not writeLock protected.
Hence, in getComputedResourceLimitForAll/ActiveUsers, latestVersionOfUsersState may change while we are operating inside the writeLock block. Since we use the saved version, latestVersionOfUserCount, to update the local cache (per partition and scheduling mode), the cache will still be invalidated correctly even if some other thread has changed the real latestVersionOfUsersState in the meantime. Please share your thoughts.

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.0001.patch, YARN-5889.0001.suggested.patchnotes, YARN-5889.0002.patch, YARN-5889.0003.patch, YARN-5889.0004.patch, YARN-5889.0005.patch, YARN-5889.0006.patch, YARN-5889.0007.patch, YARN-5889.0008.patch, YARN-5889.0009.patch, YARN-5889.v0.patch, YARN-5889.v1.patch, YARN-5889.v2.patch
>
>
> Currently user-limit is computed during every heartbeat allocation cycle with a write lock. To improve performance, this ticket focuses on moving the user-limit calculation out of the heartbeat allocation flow.
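
For readers following the thread, the two points in the comment (the null check on an empty cache, and saving the version once before updating the cache) can be illustrated with a minimal sketch. This is not code from any YARN-5889 patch: the class UserLimitCacheSketch, its Entry type, the string cache key, the addActiveUser method, and the placeholder formula (queue capacity divided by active users) are all assumptions made for illustration. Only the names isRecomputeNeeded and latestVersionOfUsersState are borrowed from the discussion.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch only; not the scheduler's real classes or formula.
class UserLimitCacheSketch {
  // A cached limit together with the users-state version it was computed for.
  private static final class Entry {
    final long version;
    final long limit;
    Entry(long version, long limit) { this.version = version; this.limit = limit; }
  }

  private final long queueCapacity;
  private final AtomicInteger activeUsers = new AtomicInteger();
  // Bumped outside the write lock whenever the user state changes.
  private final AtomicLong latestVersionOfUsersState = new AtomicLong();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Keyed by partition and scheduling mode; empty after start or refresh.
  private final Map<String, Entry> cache = new HashMap<>();
  private final AtomicInteger recomputeCount = new AtomicInteger();

  UserLimitCacheSketch(long queueCapacity) { this.queueCapacity = queueCapacity; }

  // Change the state first, then bump the version, so a stale snapshot can
  // only cause an extra recompute, never a stale cache hit.
  void addActiveUser() {
    activeUsers.incrementAndGet();
    latestVersionOfUsersState.incrementAndGet();
  }

  // The null check covers the very first request, when nothing is cached yet.
  private static boolean isRecomputeNeeded(Entry cached, long latestVersion) {
    return cached == null || cached.version != latestVersion;
  }

  long getComputedLimit(String partition, String schedulingMode) {
    String key = partition + "/" + schedulingMode;
    lock.readLock().lock();
    try {
      Entry cached = cache.get(key);
      if (!isRecomputeNeeded(cached, latestVersionOfUsersState.get())) {
        return cached.limit;
      }
    } finally {
      lock.readLock().unlock();
    }
    lock.writeLock().lock();
    try {
      // Save the version once. If another thread bumps the real version while
      // we compute, the stored (older) version mismatches on the next call.
      long savedVersion = latestVersionOfUsersState.get();
      long limit = queueCapacity / Math.max(1, activeUsers.get()); // placeholder formula
      cache.put(key, new Entry(savedVersion, limit));
      recomputeCount.incrementAndGet();
      return limit;
    } finally {
      lock.writeLock().unlock();
    }
  }

  int getRecomputeCount() { return recomputeCount.get(); }
}
```

In this sketch, the first call for a given partition and scheduling mode always recomputes because the cache entry is null; later calls reuse the entry until addActiveUser bumps the version. Storing the saved version rather than re-reading the counter after the computation is what keeps a concurrent bump from being masked.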