hbase-user mailing list archives

From "Kleegrewe, Christian" <christian.kleegr...@siemens.com>
Subject Region split behavior
Date Tue, 24 May 2011 15:03:56 GMT
Dear all,

We have a small test cluster with 5 nodes: 1 master and 4 datanodes. The nodes run Ubuntu
Desktop 10.10, Hadoop version 'Hadoop 0.20.2-CDH3B4' and HBase version 0.90.1-CDH3B4.
The HBase database is well balanced and contains one table (TAB_1) holding 270,000,000
data records. The table consists of 84 regions, each with 1 to 3 storefiles and a size of
100 MB to 216 MB per region. The rowkey is a monotonically increasing timestamp, which
I know is bad for parallelization, but we are only testing some map features so far.

When I create TAB_1, it distributes very well over the 4 region servers, so that each server
holds 20 to 22 regions after creation. When I create a second table (TAB_2) with the same
rowkey and the same data, this table does not distribute over the servers but is stored
on only one of the region servers (R1). The other nodes (R2, R3, R4) are not used for storing
it. The cluster as a whole remains balanced, but I can see regions of TAB_1 drifting away
from R1, which is used for storing TAB_2. After a while there are no regions of TAB_1 left
on R1, and only then does the load balancer start moving regions of TAB_2 to R2 .. R4. The
active region that is being written to remains on R1.
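
To make the sequence above concrete, here is a toy model in Python. It is only a sketch of
the observation under stated assumptions, not HBase's balancer code: it assumes TAB_2 starts
as a single region on R1, that new regions created by splits stay on R1, that the balancer
only evens out the total region count per server without looking at the table, and that it
moves TAB_1 regions first and never the active region. The function name `simulate` and the
exact 21-regions-per-server starting layout are made up for illustration.

```python
def simulate(splits=120, servers=("R1", "R2", "R3", "R4")):
    """Toy model of the observed region drift; returns {server: [regions]}."""
    # A region is a (table, index) tuple. TAB_1: 84 regions, spread evenly.
    hosted = {s: [("TAB_1", i) for i in range(k, 84, len(servers))]
              for k, s in enumerate(servers)}
    # Assumption: the new table starts as one region on R1.
    active = ("TAB_2", 0)
    hosted["R1"].append(active)
    for n in range(1, splits + 1):
        # Monotonic rowkeys: only the last region is written to and splits.
        # Assumption: the new region stays on R1 and becomes the active one.
        active = ("TAB_2", n)
        hosted["R1"].append(active)
        # Assumed balancer: even out total region counts, ignoring tables.
        while True:
            busiest = max(servers, key=lambda s: len(hosted[s]))
            idlest = min(servers, key=lambda s: len(hosted[s]))
            if len(hosted[busiest]) - len(hosted[idlest]) <= 1:
                break
            # Assumed pick order: TAB_1 regions first, then the oldest
            # TAB_2 regions, never the region currently being written to.
            region = min(r for r in hosted[busiest] if r != active)
            hosted[busiest].remove(region)
            hosted[idlest].append(region)
    return hosted
```

With a small number of splits the model keeps all of TAB_2 on R1 while TAB_1 regions move
away; with enough splits R1 holds no TAB_1 regions any more, older TAB_2 regions start to
move to R2 .. R4, and the active region is still on R1.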

How can this behaviour be explained? I would normally expect TAB_2 to be distributed over
all 4 region servers as it is created, rather than being stored on one of the servers while
the load balancer shifts the data in the background.

Is this normal HBase behaviour, or is there some misconfiguration in my cluster?

Thanks in advance



Siemens AG
Corporate Technology
Corporate Research and Technologies
Otto-Hahn-Ring 6
81739 München, Deutschland
Tel.: +49 (89) 636-42722
Fax: +49 (89) 636-41423

