hbase-user mailing list archives

From yeshwanth kumar <yeshwant...@gmail.com>
Subject HBase Region Size of 2.5 TB
Date Fri, 26 Aug 2016 22:23:01 GMT
Hi, we are using CDH 5.7 with HBase 1.2.

We are performance-testing HBase with a regular load on a cluster that has
4 Region Servers.

The input data is around 2 TB of compressed binary files, which we process
and write to HBase as key-value pairs.
The output data size in HBase is almost 4 times larger, around 8 TB, because
we are writing it as text.
This process is a MapReduce job.

During the load we observed a lot of GC happening on the Region Servers, so
we changed a couple of parameters to reduce the GC.

We increased the flush size from 128 MB to 1 GB, compactionThreshold to 50,
and regionserver.maxlogs to 42.
The following are the configuration values we changed from the defaults:

hbase.hregion.memstore.flush.size = 1 GB
hbase.hregion.preclose.flush.size = 50 MB
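
For reference, a minimal sketch of how the settings described above could be
written out as hbase-site.xml property entries, with the sizes converted to
bytes. The full property names hbase.hstore.compactionThreshold and
hbase.regionserver.maxlogs are assumptions, since the message only gives the
short names "compactionThreshold" and "regionserver.maxlogs"; the helper
functions are hypothetical and only for illustration.

```python
# Sketch: render the settings from this message as hbase-site.xml properties.
# Size-valued HBase properties are given in bytes, so convert them first.
MB = 1024 ** 2
GB = 1024 ** 3

settings = {
    "hbase.hregion.memstore.flush.size": 1 * GB,    # 1 GB, as stated in the message
    "hbase.hregion.preclose.flush.size": 50 * MB,   # 50 MB, as stated in the message
    "hbase.hstore.compactionThreshold": 50,         # assumed full property name
    "hbase.regionserver.maxlogs": 42,               # assumed full property name
}


def to_property_xml(name, value):
    """Format one setting as an hbase-site.xml <property> element."""
    return (
        "<property>\n"
        f"  <name>{name}</name>\n"
        f"  <value>{value}</value>\n"
        "</property>"
    )


def render(all_settings):
    """Join all settings into one block of <property> elements."""
    return "\n".join(to_property_xml(n, v) for n, v in all_settings.items())


if __name__ == "__main__":
    print(render(settings))
```

Writing the values in bytes makes it easier to compare them directly against
what the Region Servers report as their effective configuration.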


After the load, we observed that the HBase table has only 4 regions, each
around 2.5 TB in size.

I am trying to understand which configuration parameter caused this issue.

I was going through this article

Region split policy in our HBase is
According to the region split policy, the Region Server should create new
regions when the region size limit is exceeded.
Can someone explain the root cause?

