Message-ID: <48F7A70C.9010607@duboce.net>
Date: Thu, 16 Oct 2008 13:41:48 -0700
From: stack
To: hbase-user@hadoop.apache.org
Subject: Re: HBase and hadoop cluster rebalance

Daniel Ploeg wrote:
> Hi all,
>
> I performed a cluster rebalance on my test cluster yesterday (5 regionserver
> / datanodes each with approx 400GB - total approx 2TB HDFS) and I would like
> to know if the mailing lists have seen similar results to what I've seen.
>

I talked to the lads running hbase here at Powerset. They believe they have
seen something similar when they grow the cluster by some significant
percentage (20-30%). The addition of new machines brings on a rebalancing,
and thereafter hbase runs "faster".

> I had a single table with a single column family and loaded it up so that it
> just about filled the entire cluster. Actually one or two of the nodes had
> run out of space, yet the fifth machine only had 50% of its disks utilised
> (which is why I thought a rebalance was in order). There are a total of 1475
> regions in the cluster. Prior to starting the rebalance the cluster only had
> about 250GB left at its disposal. After the rebalance I now have almost
> 800GB free.
>

With 1475 regions, you should update to 0.18.1 (coming soon).

> Furthermore, I was performing read tests prior to the rebalance and getting
> a response time of approx 500ms per row (each row has 10000 column instances
> of the column family which were deserialised as part of the test). After the
> rebalance my read times reduced to around 340ms.
>
>

If you could have fewer columns per column family, you'd get somewhat better
performance: see HBASE-867.

Good on you Daniel,
St.Ack