Date: Thu, 3 Aug 2017 02:52:00 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-15547) Balancer should take into account number of PENDING_OPEN regions

    [ https://issues.apache.org/jira/browse/HBASE-15547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112092#comment-16112092 ]

stack commented on HBASE-15547:
-------------------------------

Removing 2.0.0. Assign is different in 2.0.0. No PENDING_OPEN state for instance.

> Balancer should take into account number of PENDING_OPEN regions
> ----------------------------------------------------------------
>
>                 Key: HBASE-15547
>                 URL: https://issues.apache.org/jira/browse/HBASE-15547
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, Operability, Region Assignment
>    Affects Versions: 0.98.0, 1.0.0
>            Reporter: Sean Busbey
>            Priority: Critical
>             Fix For: 1.5.0
>
>
> We recently had a cluster get into a bad state where a subset of region servers consistently could not open new regions (but could continue serving the regions they already hosted).
> Recovering the cluster was just a matter of restarting region servers in sequence.
> However, this led to things getting substantially worse before they got better, since the bulk assigner continued to place a uniform number of recovered regions across all servers, including those that could not open regions.
> It would be useful if the balancer could penalize regionservers with a backlog of pending_open regions and place more work on those regionservers that are properly serving.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
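
The idea in the issue description (penalize servers with a PENDING_OPEN backlog instead of spreading regions uniformly) could be sketched roughly as below. This is a minimal illustration in Python, not HBase's balancer or bulk assigner code: the function `assign_regions`, the `hosted` and `pending_open` maps, and the `penalty` weight are all hypothetical names and values chosen for the example.

```python
import heapq
from typing import Dict, List


def assign_regions(
    regions: List[str],
    hosted: Dict[str, int],
    pending_open: Dict[str, int],
    penalty: float = 4.0,
) -> Dict[str, List[str]]:
    """Greedily place each region on the server with the lowest cost.

    cost = regions already hosted + penalty * regions stuck in PENDING_OPEN.
    With penalty=0 this degenerates to a uniform, load-only placement.
    All names and the default weight are assumptions for illustration.
    """
    if not hosted:
        raise ValueError("no servers to assign to")
    plan: Dict[str, List[str]] = {server: [] for server in hosted}
    # Heap entries are (cost, server); the server name breaks ties deterministically.
    heap = [(hosted[s] + penalty * pending_open.get(s, 0), s) for s in hosted]
    heapq.heapify(heap)
    for region in regions:
        cost, server = heapq.heappop(heap)
        plan[server].append(region)
        # Charge one unit of load for the region just placed on this server.
        heapq.heappush(heap, (cost + 1, server))
    return plan


# Three equally loaded servers; rs3 has a backlog of regions it cannot open.
servers_hosted = {"rs1": 10, "rs2": 10, "rs3": 10}
backlog = {"rs3": 5}
recovered = [f"r{i}" for i in range(6)]

penalized_plan = assign_regions(recovered, servers_hosted, backlog)
uniform_plan = assign_regions(recovered, servers_hosted, backlog, penalty=0)
print(penalized_plan)
print(uniform_plan)
```

In this toy setup the penalized plan steers the recovered regions away from the backlogged server, while `penalty=0` reproduces the uniform spread that made the recovery worse. How large the weight should be, and how a backlog would be measured in a real cluster, are open questions the sketch does not try to answer.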