From dev-return-74852-archive-asf-public=cust-asf.ponee.io@hbase.apache.org Thu Jun 20 05:11:50 2019
Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Reply-To: dev@hbase.apache.org
MIME-Version: 1.0
References: <265c9f08.3086.16b6873c5a9.Coremail.liuxing19890423@163.com>
In-Reply-To: <265c9f08.3086.16b6873c5a9.Coremail.liuxing19890423@163.com>
From: ramkrishna vasudevan
Date: Thu, 20 Jun 2019 10:41:24 +0530
Subject: Re: Adding a new balancer to HBase
To: dev
Content-Type: multipart/alternative;
 boundary="0000000000004eb57b058bba61a5"

--0000000000004eb57b058bba61a5
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

Seems like a very good idea for cloud servers. Please feel free to raise a
JIRA and contribute your patch.

Regards
Ram

On Tue, Jun 18, 2019 at 8:09 AM 刘新星 wrote:

> I'm interested in this. It sounds like a weighted load balancer and would
> be valuable for users who deploy their HBase clusters on cloud servers.
> You can create a JIRA and make a patch for better discussion.
>
> At 2019-06-18 05:00:54, "Pierre Zemb" wrote:
> >Hi!
> >
> >My name is Pierre, I'm working at OVH, a European cloud provider. Our
> >team, Observability, relies heavily on HBase to store telemetry. We
> >would like to open the discussion about adding a new Balancer to 1.4.x
> >and 2.x.
> >
> ><https://gist.github.com/PierreZ/15560e12c147e661e5c1b5f0edeb9282#our-situation>
> >Our situation
> >
> >The Observability team at OVH is responsible for handling logs and
> >metrics from all servers, applications, and equipment within OVH. HBase
> >is used as the datastore for metrics. We are using an open-source
> >software called Warp10 to handle all the metrics coming from OVH's
> >infrastructure. We are operating three HBase 1.4 clusters, including one
> >with 218 RegionServers which is growing every month.
> >
> >We found out that *in our use case* (single table, dedicated HBase and
> >Hadoop tuned for our use case, good key distribution), *the number of
> >regions per RS was the real limit for us*.
> >
> >Over the years, due to historical reasons and also the need to benchmark
> >new machines, we ended up with different groups of hardware: some servers
> >can handle only 180 regions, whereas the biggest can handle more than
> >900. Because of such a difference, we had to disable load balancing to
> >avoid the roundRobinAssignment.
> >We developed some internal tooling that is responsible for balancing
> >regions across RegionServers. That was 1.5 years ago.
> >
> >Today, we are thinking about fully integrating it into HBase, using the
> >LoadBalancer interface. We started working on a new Balancer called
> >HeterogeneousBalancer, which will be able to fulfill our needs.
> >
> ><https://gist.github.com/PierreZ/15560e12c147e661e5c1b5f0edeb9282#how-does-it-works>
> >How does it work?
> >
> >A rule file is loaded before balancing. It contains one rule per line. A
> >rule is composed of a regexp for the hostname and a limit. For example,
> >we could have:
> >
> >rs[0-9] 200
> >rs1[0-9] 50
> >
> >RegionServers with a hostname matching the first rule will have a limit
> >of 200, and those matching the second a limit of 50. If there's no
> >match, a default is set.
> >
> >Thanks to the rules, we have two pieces of information: the maximum
> >number of regions for this cluster, and the limit for each server.
> >HeterogeneousBalancer will try to balance regions according to each
> >server's capacity.
> >
> >Let's take an example. Let's say that we have 20 RS:
> >
> >   - 10 RS, named rs0 through rs9, loaded with 60 regions each, and each
> >   can handle 200 regions.
> >   - 10 RS, named rs10 through rs19, loaded with 60 regions each, and
> >   each can support 50 regions.
> >
> >Based on the following rules:
> >
> >rs[0-9] 200
> >rs1[0-9] 50
> >
> >The second group is overloaded, whereas the first group has plenty of
> >space.
> >
> >We know that we can handle at most *2500 regions* (200*10 + 50*10) and
> >we currently have *1200 regions* (60*20). HeterogeneousBalancer will
> >understand that the cluster is *full at 48.0%* (1200/2500). Based on this
> >information, we will then *try to put all the RegionServers at ~48% of
> >load according to the rules.* In this case, it will move regions from
> >the second group to the first.
> >
> >The balancer will:
> >
> >   - compute how many regions need to be moved.
> >   In our example, by moving 36 regions off rs10, we could go from
> >   120.0% to 48.0% (24 of 50 regions)
> >   - select regions with the lowest data locality
> >   - try to find an appropriate RS for the region. We will take the
> >   lowest available RS.
> >
> ><https://gist.github.com/PierreZ/15560e12c147e661e5c1b5f0edeb9282#current-status>
> >Current status
> >
> >We started the implementation, but it is not finished yet. We are
> >planning to deploy it on a lower-impact cluster for testing, and then
> >put it on our biggest cluster.
> >
> >We have a basic implementation of all methods, but we need to add more
> >tests and make the code more robust. You can find the proof-of-concept
> >here
> ><https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/HeterogeneousBalancer.java>,
> >and some early tests here
> ><https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/HeterogeneousBalancer.java>,
> >here
> ><https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestHeterogeneousBalancerBalance.java>,
> >and here
> ><https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestHeterogeneousBalancerRules.java>.
> >
> >We wrote the balancer for our use case, which means that:
> >
> >   - there is one table
> >   - there is no region replica
> >   - there is good key dispersion
> >   - there are no regions on the master
> >
> >However, we believe that supporting the other cases will not be too
> >complicated to implement. We are also thinking about the possibility of
> >limiting overassignments of regions by moving them to the least loaded
> >RS.
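
The rule matching and the load arithmetic described above can be sketched
in a few lines of Python. This is only an illustration of the numbers in
the example, not the HeterogeneousBalancer code. The function names
(parse_rules, limit_for, plan), the default limit of 100, first-match-wins
rule order, whole-hostname matching, and rounding the fair share down are
all assumptions made for the sketch.

```python
import re


def parse_rules(text):
    """Parse lines of '<hostname regexp> <limit>' into (pattern, limit) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        pattern, limit = line.split()
        rules.append((re.compile(pattern), int(limit)))
    return rules


def limit_for(hostname, rules, default_limit):
    """Return the limit of the first rule matching the whole hostname.

    Assumption: the regexp must match the entire hostname, otherwise
    'rs[0-9]' would also match 'rs10'.
    """
    for pattern, limit in rules:
        if pattern.fullmatch(hostname):
            return limit
    return default_limit


def plan(regions_per_server, rules, default_limit):
    """Compute the cluster load (%) and, per server, how many regions to
    shed (positive) or how many it could still receive (negative)."""
    limits = {h: limit_for(h, rules, default_limit) for h in regions_per_server}
    capacity = sum(limits.values())
    total = sum(regions_per_server.values())
    load_pct = 100 * total / capacity
    delta = {}
    for host, count in regions_per_server.items():
        # Integer arithmetic avoids float rounding; the share is rounded down.
        fair_share = limits[host] * total // capacity
        delta[host] = count - fair_share
    return load_pct, delta


# The 20-server example from the message: every server holds 60 regions.
rules = parse_rules("rs[0-9] 200\nrs1[0-9] 50\n")
cluster = {f"rs{i}": 60 for i in range(20)}
load_pct, delta = plan(cluster, rules, default_limit=100)  # 100 is hypothetical
print(load_pct, delta["rs0"], delta["rs10"])  # 48.0 -36 36
```

With these assumptions, each server in the second group has 36 regions to
shed and each server in the first group has room for 36 more, so the
cluster ends up at 48% on every RegionServer.
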
> >
> >Even if the balancing strategy seems simple, we do think that the ability
> >to run an HBase cluster on heterogeneous hardware is vital, especially in
> >cloud environments, because you may not be able to buy the same server
> >specs throughout the years.
> >
> >What do you think about our approach? Are you interested in such a
> >contribution?
> >---
> >
> >Pierre ZEMB - OVH Group
> >Observability/Metrics - Infrastructure Engineer
> >pierrezemb.fr
> >+33 7 86 95 61 65
>

--0000000000004eb57b058bba61a5--