Subject: Re: Fixing badly distributed table manually.
From: Mohit Anchlia
To: user@hbase.apache.org
Date: Mon, 24 Dec 2012 09:53:47 -0800

On Mon, Dec 24, 2012 at 8:27 AM, Ivan Balashov wrote:

>
> Vincent Barat writes:
>
> >
> > Hi,
> >
> > Balancing regions between RS is correctly handled by HBase: I mean
> > that your RSs always manage the same number of regions (the balancer
> > takes care of it).
> >
> > Unfortunately, balancing all the regions of one particular table
> > between the RS of your cluster is not always easy, since HBase (as
> > of 0.90.3), when it comes to splitting a region, always creates the
> > new one on the same RS. This means that if you start with a 1-region
> > table, and then you insert lots of data into it, new regions
> > will always be created on the same RS (if your insert is an M/R job,
> > you saturate this RS). Eventually, the balancer will at some point
> > decide to move one of these regions to another RS, limiting the
> > issue, but it is not controllable.
> >
> > Here at Capptain, we solved this problem by developing a special
> > Python script, based on the HBase shell, that entirely
> > balances all the regions of all tables across all RS. It ensures that
> > the regions of each table are uniformly deployed on all RS of the
> > cluster, with a minimum of region transitions.
> >

Is it possible to describe, at a high level, the logic of what you did?

> > It is fast, and even if it can trigger a lot of region transitions,
> > there is very little impact at runtime and it can be run safely.
> >
> > If you are interested, just let me know, I can share it.
> >
> > Regards,
> >
>
> Vincent,
>
> I would very much like to see and possibly use the script that you
> mentioned. We've just run into the same issue (after the table
> was truncated, it was re-created with only 1 region, and
> after data loading and manual splits we ended up with all
> regions on the same RS).
>
> If you could share the script, it would be really appreciated,
> I believe not only by me.
>
> Thanks,
> Ivan
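
For readers wondering what such per-table balancing could look like, here is a minimal sketch in Python. It is not Vincent's script, which was not posted in this thread. It assumes you have already obtained, by some means, a mapping of one table's regions to the region servers hosting them. The function name `plan_moves` and the sample region and server names are hypothetical. The sketch only computes a plan; applying each move (for example with the HBase shell `move` command) is left out.

```python
def plan_moves(assignment, servers):
    """Plan region moves so one table is spread evenly over all servers.

    assignment: dict mapping region name -> server currently hosting it
                (every server it mentions must also appear in `servers`).
    servers:    list of all region server names in the cluster.
    Returns a list of (region, source_server, destination_server) tuples.
    """
    hosted = {s: [] for s in servers}
    for region in sorted(assignment):
        hosted[assignment[region]].append(region)

    # Every server gets `base` regions; `extra` servers get one more.
    base, extra = divmod(len(assignment), len(servers))

    # Hand the larger quotas to the busiest servers, so that as few
    # regions as possible have to leave their current server.
    ranked = sorted(servers, key=lambda s: (-len(hosted[s]), s))
    quota = {s: base + (1 if i < extra else 0) for i, s in enumerate(ranked)}

    # Collect the regions that exceed each server's quota.
    surplus = []
    for s in ranked:
        while len(hosted[s]) > quota[s]:
            surplus.append((hosted[s].pop(), s))

    # Hand the surplus regions to the servers that are below quota.
    moves = []
    for s in ranked:
        while len(hosted[s]) < quota[s]:
            region, source = surplus.pop()
            hosted[s].append(region)
            moves.append((region, source, s))
    return moves


# Hypothetical example: a freshly split table with every region on rs1.
example = {"r1": "rs1", "r2": "rs1", "r3": "rs1", "r4": "rs1", "r5": "rs1"}
for region, source, destination in plan_moves(example, ["rs1", "rs2", "rs3"]):
    print(region, source, "->", destination)
```

Running the planner once per table gives each table an even spread. Note that this simple version looks at one table at a time and does not try to keep the total number of regions per server balanced across tables.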