Date: Mon, 28 Nov 2016 03:21:58 +0000 (UTC)
From: "Guanghao Zhang (JIRA)"
To: dev@hbase.apache.org
Subject: [jira] [Created] (HBASE-17178) Add region balance throttling

Guanghao Zhang created HBASE-17178:
--------------------------------------

             Summary: Add region balance throttling
                 Key: HBASE-17178
                 URL: https://issues.apache.org/jira/browse/HBASE-17178
             Project: HBase
          Issue Type: Improvement
          Components: Balancer
            Reporter: Guanghao Zhang

Our online cluster serves dozens of tables, and different tables serve different
services. If the balancer moves too many regions at the same time, it reduces availability for some tables or services. So we added region balance throttling on our online serving cluster. We introduced a new config, hbase.balancer.max.balancing.regions, which sets the max number of regions in transition when balancing. If we set this to 1 and a table has 100 regions, then the table will have at least 99 regions available at any time. It helps a lot for our use case and has been running for a long time on our production cluster.

But for some use cases, we need the balancer to run faster. Suppose a cluster has 100 regionservers and adds 50 new regionservers for peak requests. Then the balancer needs to run as soon as possible so that the cluster reaches a balanced state soon. Our idea is to compute the max number of regions in transition from the max balancing time and the average time of a region in transition. The balancer then uses the computed value for throttling.

Examples for understanding: a cluster has 100 regionservers, each regionserver has 200 regions, the average time of a region in transition is 1 second, and we configure the max balancing time to be 10 * 60 seconds.

Case 1. One regionserver crashes, so the cluster needs to balance at most 200 regions. Then 200 / (10 * 60s / 1s) < 1, which means the max number of regions in transition is 1 when balancing. The balancer moves regions one by one and the cluster keeps high availability while balancing.

Case 2. Another 100 regionservers are added, so the cluster needs to balance at most 10000 regions. Then 10000 / (10 * 60s / 1s) = 16.7, which means the max number of regions in transition is 17 when balancing. The cluster can then reach a balanced state within the max balancing time.

Any suggestions are welcome.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
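
For reference, the arithmetic in Case 1 and Case 2 of the message above can be sketched in a few lines of Python. This is only an illustration of the proposed formula, not code from HBase or from any patch: the function name and parameter names are hypothetical, and the choices to round up and to use a floor of 1 are assumptions inferred from the two worked cases (16.7 becomes 17, and a value below 1 becomes 1).

```python
import math


def max_regions_in_transition(regions_to_move, max_balancing_time_s, avg_rit_time_s):
    """Illustrative throttle value: how many regions may move at once.

    Hypothetical helper, not an HBase API. Assumes rounding up and a
    minimum of 1, as suggested by the two cases in the message.
    """
    if max_balancing_time_s <= 0 or avg_rit_time_s <= 0:
        raise ValueError("times must be positive")
    # Sequential moves that fit into the balancing window for one "slot":
    # max_balancing_time / avg_rit_time. Dividing the regions to move by
    # that count gives the required number of concurrent slots.
    needed = regions_to_move * avg_rit_time_s / max_balancing_time_s
    return max(1, math.ceil(needed))


# Case 1: one regionserver crashes, at most 200 regions to balance.
case1 = max_regions_in_transition(200, 10 * 60, 1)

# Case 2: 100 regionservers are added, at most 10000 regions to balance.
case2 = max_regions_in_transition(10000, 10 * 60, 1)

print(case1, case2)
```

With the numbers from the message (10 * 60 seconds max balancing time, 1 second average time in transition), the two calls reproduce the values 1 and 17 given in the cases.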