Date: Fri, 19 Oct 2012 21:52:13 +0000 (UTC)
From: "Aditya Kishore (JIRA)"
To: issues@hbase.apache.org
Message-ID: <317862378.3228.1350683533222.JavaMail.jiratomcat@arcas>
In-Reply-To: <1009162098.45767.1342141536580.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Updated] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments


     [ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Kishore updated HBASE-6389:
----------------------------------
    Release Note:
Reverts the cluster startup behavior to that of releases before 0.94.0. The Master now waits until "hbase.master.wait.on.regionservers.mintostart" Region Servers have registered with it before it starts region assignment. The default value of this setting is 1.
In large clusters with thousands of regions, you may want to raise this to a number of Region Servers sufficient to open those regions in parallel. If left at the default, the Master could at times assign all regions to a single Region Server, which results in a slow start and, in the worst case, could OOM the Region Server (sometimes resulting in META inconsistency).

> Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6389
>                 URL: https://issues.apache.org/jira/browse/HBASE-6389
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.94.0, 0.96.0
>            Reporter: Aditya Kishore
>            Assignee: Aditya Kishore
>            Priority: Critical
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch, HBASE-6389_trunk.patch, HBASE-6389_trunk_v2.patch, HBASE-6389_trunk_v2.patch, org.apache.hadoop.hbase.TestZooKeeper-output.txt, testReplication.jstack
>
>
> Continuing from HBASE-6375.
> It seems I was mistaken in assuming that raising "hbase.master.wait.on.regionservers.mintostart" from its default of 1 to a sufficient number would prevent all regions from being assigned to one (or a small number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior changed from 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, the Master proceeds immediately after the timeout has lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not been reached.
> Reading the current conditions of waitForRegionServers() makes this clear:
> {code:title=ServerManager.java (trunk rev:1360470)}
> ....
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
>   throws InterruptedException {
> ....
> ....
>     while (
>       !this.master.isStopped() &&
>         slept < timeout &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || count < minToStart)
>       ){
> ....
> {code}
> So with the current conditions, the wait ends as soon as the timeout is reached, even if fewer RSes than "hbase.master.wait.on.regionservers.mintostart" have checked in with the Master, and the Master proceeds with region assignment among these RSes alone.
> As mentioned in -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-, and I concur, this could have a disastrous effect in a large cluster, especially now that MSLAB is turned on.
> To enforce the required quorum as specified by "hbase.master.wait.on.regionservers.mintostart" irrespective of the timeout, these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of these conditions is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    no new region server has checked in for
>    *    'hbase.master.wait.on.regionservers.interval' time AND
>    *    the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
>     int minToStart = this.master.getConfiguration().
>         getInt("hbase.master.wait.on.regionservers.mintostart", 1);
>     int maxToStart = this.master.getConfiguration().
>         getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
>     // Never let the upper bound fall below the required minimum.
>     if (maxToStart < minToStart) {
>       maxToStart = minToStart;
>     }
> ..
> ..
>     // Keep waiting until mintostart, the quiet interval and the timeout are all satisfied.
>     while (
>       !this.master.isStopped() &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || timeout > slept || count < minToStart)
>       ){
> ..
> {code}
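
The difference between the two loop conditions quoted in the issue can be checked in isolation. The following is a minimal sketch, not HBase code: the class `WaitConditionSketch` and its two methods are hypothetical names introduced here, the parameters simply mirror the local variables in the quoted excerpts, and the numeric values in `main` are illustrative assumptions. Each method returns true while the Master would keep waiting.

```java
// Hypothetical sketch: these predicates mirror the two loop conditions quoted
// in the issue. They are not part of HBase's ServerManager API.
final class WaitConditionSketch {

    /** Condition from trunk rev 1360470: reaching the timeout alone ends the wait. */
    static boolean keepWaitingOld(boolean stopped, long slept, long timeout,
                                  int count, int minToStart, int maxToStart,
                                  long lastCountChange, long interval, long now) {
        return !stopped
            && slept < timeout
            && count < maxToStart
            && (lastCountChange + interval > now || count < minToStart);
    }

    /** Proposed condition: the wait ends only when mintostart, interval and timeout all hold. */
    static boolean keepWaitingProposed(boolean stopped, long slept, long timeout,
                                       int count, int minToStart, int maxToStart,
                                       long lastCountChange, long interval, long now) {
        return !stopped
            && count < maxToStart
            && (lastCountChange + interval > now || timeout > slept || count < minToStart);
    }

    public static void main(String[] args) {
        // Illustrative values (assumptions): the timeout has already lapsed,
        // but only 1 of the 5 required region servers has registered.
        long timeout = 4500, slept = 5000, interval = 1500, lastCountChange = 0, now = 5000;
        int count = 1, minToStart = 5, maxToStart = Integer.MAX_VALUE;

        System.out.println("old keeps waiting: " + keepWaitingOld(
            false, slept, timeout, count, minToStart, maxToStart, lastCountChange, interval, now));
        System.out.println("proposed keeps waiting: " + keepWaitingProposed(
            false, slept, timeout, count, minToStart, maxToStart, lastCountChange, interval, now));
    }
}
```

With these values the old condition stops waiting (prints `false`) while the proposed one keeps waiting (prints `true`), which is the behavior change the issue asks for. Once `count` reaches `minToStart` under the same timings, the proposed condition also releases the Master.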