Date: Mon, 22 Dec 2014 06:58:13 +0000 (UTC)
From: "Varun Saxena (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode

[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255497#comment-14255497 ]

Varun Saxena commented on YARN-2962:
------------------------------------

[~rakeshr], thanks for your input.

An ApplicationID in YARN has the format
{noformat}application_[cluster timestamp]_[sequence number]{noformat}
Here the sequence number has 4 digits and is in the range 0000-9999.

Going along the lines of what you are saying, I think we can split on the sequence number part of the ApplicationID, as the cluster timestamp will probably be the same for most application IDs.
My suggestion is to have it as
{noformat}(app_root)/application_[cluster timestamp]_/[first 2 digits of sequence number]/[last 2 digits]{noformat}

We can view it as follows:
{noformat}
 * |--- RM_APP_ROOT
 * |     |----- (application_{cluster timestamp}_)
 * |     |       |----- (00 to 99)
 * |     |       |       |------ (00 to 99)
 * |     |       |       |        |----- (#ApplicationAttemptIds)
{noformat}

[~rakeshr] and [~kasha], kindly comment on the approach.

One constraint is that this would entail a larger number of calls to ZK when the RM is recovering. I am not sure how many znodes it takes to reach the 1 MB limit. We could also split the sequence number into its first digit and its last 3 digits.
Moreover, I don't see much of an issue with application attempt znodes, as max-attempts is limited to 2 by default.

> ZKRMStateStore: Limit the number of znodes under a znode
> --------------------------------------------------------
>
>                 Key: YARN-2962
>                 URL: https://issues.apache.org/jira/browse/YARN-2962
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Karthik Kambatla
>            Assignee: Varun Saxena
>            Priority: Critical
>
> We ran into this issue where we were hitting the default ZK server message size configs, primarily because the message had too many znodes, even though individually they were all small.
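
For illustration only, the hierarchical layout suggested in the comment could be sketched as below. This is not code from ZKRMStateStore: the class name AppZnodePathSketch, the method pathFor, and the root path used in the usage note are hypothetical, and the sketch assumes the four-digit sequence number described in the comment. The firstDigits parameter covers both splits mentioned (2 for the 2/2 split, 1 for the 1/3 split).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of ZKRMStateStore. */
final class AppZnodePathSketch {

    // Assumption: the sequence number is exactly four digits, as stated in the comment.
    private static final Pattern APP_ID =
        Pattern.compile("application_(\\d+)_(\\d{4})");

    private AppZnodePathSketch() {
    }

    /**
     * Maps an application ID to a nested znode path under appRoot.
     *
     * @param appRoot     parent znode for all applications (hypothetical value)
     * @param appId       ID of the form application_[cluster timestamp]_[sequence number]
     * @param firstDigits number of leading sequence digits used for the first level
     */
    static String pathFor(String appRoot, String appId, int firstDigits) {
        Matcher m = APP_ID.matcher(appId);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected application ID: " + appId);
        }
        if (firstDigits < 1 || firstDigits > 3) {
            throw new IllegalArgumentException("firstDigits must be 1, 2 or 3");
        }
        String timestamp = m.group(1);
        String sequence = m.group(2);
        // Layout: (app_root)/application_[timestamp]_/[leading digits]/[remaining digits]
        return appRoot
            + "/application_" + timestamp + "_"
            + "/" + sequence.substring(0, firstDigits)
            + "/" + sequence.substring(firstDigits);
    }
}
```

With a made-up root of /rmstore/apps, pathFor("/rmstore/apps", "application_1419231493000_0123", 2) gives /rmstore/apps/application_1419231493000_/01/23, and the same call with firstDigits set to 1 gives /rmstore/apps/application_1419231493000_/0/123. Under the 2/2 split no znode below a given cluster timestamp has more than 100 children, at the cost of extra reads during RM recovery.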