Date: Fri, 14 Feb 2014 16:24:27 -0800
Subject: Re: uneven region distribution
From: Ted Yu
To: "user@hbase.apache.org"

Looking at the bug fixes since 0.94.2, I wonder if you are experiencing the
following, which went into 0.94.10:

HBASE-8432 a table with unbalanced regions will balance indefinitely

Master log would tell us more.


On Fri, Feb 14, 2014 at 4:18 PM, Rohit Kelkar wrote:

> Sorry, mis-stated the version, it's 0.94.2
>
> - R
>
>
> On Fri, Feb 14, 2014 at 5:59 PM, Ted Yu wrote:
>
> > bq. it does not change the status of the assignments.
> >
> > Can you check / pastebin master log to see what caused the balancing to
> > stop?
> >
> > bq. attributing the region server crash to the disproportionately high
> > number of regions on that server?
> >
> > Checking region server log on server5 should give us more clues.
> >
> > bq. 0.92.4
> >
> > please consider upgrading :-)
> >
> >
> > On Fri, Feb 14, 2014 at 3:52 PM, Rohit Kelkar wrote:
> >
> > > I am using hbase version 0.92.4 on a 5 node cluster. I am seeing that a
> > > particular region server often crashes. A status 'simple' on hbase shell
> > > gives the following stats
> > >
> > >
> > > HBase Shell; enter 'help' for list of supported commands.
> > > Type "exit" to leave the HBase Shell
> > > Version 0.94.2, r1395367, Sun Oct 7 19:11:01 UTC 2012
> > >
> > > status 'simple'
> > > 4 live servers
> > > server7:60020 1392017875910 requestsPerSecond=0, numberOfOnlineRegions=419, usedHeapMB=3315, maxHeapMB=6127
> > > server4:60020 1392300859332 requestsPerSecond=843, numberOfOnlineRegions=379, usedHeapMB=2070, maxHeapMB=6127
> > > server3:60020 1391583646998 requestsPerSecond=429, numberOfOnlineRegions=653, usedHeapMB=3198, maxHeapMB=6127
> > > server6:60020 1391583647588 requestsPerSecond=0, numberOfOnlineRegions=966, usedHeapMB=2975, maxHeapMB=6127
> > > 1 dead servers
> > > server5,60020,1392108515637
> > > Aggregate load: 1272, regions: 2417
> > >
> > > The dead region server has 2417 regions as opposed to 419, 379, 653, 966
> > > regions on other servers. Am I right in attributing the region server
> > > crash to the disproportionately high number of regions on that server?
> > >
> > > If I invoke the balancer on hbase shell using the "balancer" command it
> > > returns true. But it does not change the status of the assignments.
> > >
> > > - R
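
A quick arithmetic check on the status output quoted above may help when reading it. The sketch below sums the per-server figures for the four live servers and compares them with the "Aggregate load" line. The numbers are copied from the thread; the regular expression assumes the one-line-per-server layout shown here and is not taken from any HBase tooling, and the helper names (`parse_live_servers`, `summarize`) are made up for illustration.

```python
import re

# Live-server lines copied from the quoted `status 'simple'` output.
STATUS_SIMPLE = """\
server7:60020 1392017875910 requestsPerSecond=0, numberOfOnlineRegions=419, usedHeapMB=3315, maxHeapMB=6127
server4:60020 1392300859332 requestsPerSecond=843, numberOfOnlineRegions=379, usedHeapMB=2070, maxHeapMB=6127
server3:60020 1391583646998 requestsPerSecond=429, numberOfOnlineRegions=653, usedHeapMB=3198, maxHeapMB=6127
server6:60020 1391583647588 requestsPerSecond=0, numberOfOnlineRegions=966, usedHeapMB=2975, maxHeapMB=6127
"""

# Assumed layout: "host:port startcode requestsPerSecond=N, numberOfOnlineRegions=M, ..."
LINE_RE = re.compile(
    r"^(?P<server>\S+:\d+)\s+\d+\s+"
    r"requestsPerSecond=(?P<rps>\d+),\s*"
    r"numberOfOnlineRegions=(?P<regions>\d+)",
    re.MULTILINE,
)


def parse_live_servers(text):
    """Map each live server to (requestsPerSecond, numberOfOnlineRegions)."""
    return {
        m["server"]: (int(m["rps"]), int(m["regions"]))
        for m in LINE_RE.finditer(text)
    }


def summarize(text):
    """Return (total requests/sec, total regions, mean regions per live server)."""
    servers = parse_live_servers(text)
    total_rps = sum(rps for rps, _ in servers.values())
    total_regions = sum(regions for _, regions in servers.values())
    return total_rps, total_regions, total_regions / len(servers)


if __name__ == "__main__":
    total_rps, total_regions, mean_regions = summarize(STATUS_SIMPLE)
    print(f"load={total_rps} regions={total_regions} mean={mean_regions}")
    # prints: load=1272 regions=2417 mean=604.25
```

The totals match the aggregate line exactly (load 1272, regions 2417), which suggests that line is the sum over the live servers rather than a count for the dead server5. If that reading is right, the output does not say how many regions server5 was carrying. The imbalance among the live servers (379 to 966 regions against a mean of about 604) is still visible, and the master log is the place to see why the balancer is not moving regions.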