hbase-user mailing list archives

From Shay Elbaz <sh...@gigya-inc.com>
Subject What are the drawbacks of having fewer and bigger regions?
Date Thu, 07 Sep 2017 15:23:53 GMT

Our 0.98.6-cdh5.3.0 12xRS cluster is configured with
hbase.hregion.max.filesize=10G. As a result, we ended up with one of the
tables having 580 regions, where each region receives writes at a relatively
low rate - low enough that the WAL (33/128MB) forces flushes every few
hours. This in turn causes an IO storm, high CPU, GC, lots of cache misses
and so on, which cause latencies at the application layer. We are aiming
to avoid those forced flushes.
A short analysis showed that the mean size of the *forced* memstores is
~40MB, and the majority belong to the above table. See below a sorted list
of forced memstores extracted from the log, and the discussed table is
As we understand it, there are 2 reasonable solutions:
1 - set the MEMSTORE_FLUSHSIZE table attribute to 40 MB, and let the
memstores flush on their own when full. The pros are obvious, but it could
lead to too many compactions. Should we worry about that?
2 - set hbase.hregion.max.filesize=40G and merge every 4 regions into one.
This should make the memstores reach 128M before the WAL reaches its file
limit, based on the above 40M mean size. Moreover, we would set
hbase.hstore.compaction.max.size=10G to avoid compacting those huge files.
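
The arithmetic behind option 2 can be checked with a rough sketch. It
assumes that every region holds about 40 MB when a WAL-forced flush hits,
and that a merged region simply accumulates the sum of its source regions'
writes over the same period. The function names are made up for
illustration and are not part of HBase.

```python
# Back-of-the-envelope check for option 2 (all figures in MB).
# Assumptions: every region holds about 40 MB when a WAL-forced flush hits,
# and a merged region accumulates the sum of its source regions' writes.

MEAN_FORCED_MEMSTORE_MB = 40.0   # mean forced-flush size reported above
FLUSH_SIZE_MB = 128.0            # memstore flush size mentioned above


def merged_memstore_mb(merge_factor, mean_mb=MEAN_FORCED_MEMSTORE_MB):
    """Estimated memstore size of a merged region at the old forced-flush point."""
    return merge_factor * mean_mb


def fills_before_forced_flush(merge_factor, flush_size_mb=FLUSH_SIZE_MB):
    """True if the merged memstore would reach the flush size on its own first."""
    return merged_memstore_mb(merge_factor) >= flush_size_mb


if __name__ == "__main__":
    for factor in (1, 2, 3, 4):
        size = merged_memstore_mb(factor)
        print(f"merge x{factor}: ~{size:.0f} MB, "
              f"self-flush first: {fills_before_forced_flush(factor)}")
```

Under these assumptions a merge factor of 4 gives roughly 160 MB, while a
factor of 3 gives 120 MB and would still fall short of 128 MB. The mean
hides a wide spread, so some merged regions could still be flushed by the
WAL first.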

This raises a conceptual question - what are the drawbacks of having fewer
and bigger regions? From the RS point of view, what is the difference
between having 40 active regions of 10G each, and one active region of 40G
consisting mainly of 4 x 10G files? Assuming the new big regions fill their
128M memstore every 2 or so hours, and the heap is large enough to hold the
index and bloom filters, how can it backfire?

Also, does major compaction respect hbase.hstore.compaction.max.size?

Thanks a lot,

6.3 M     tableB
7.5 M     tableB
7.6 M     tableB
8.7 M     tableA
12.3 M     tableA
14.2 M     tableC
14.5 M     tableC
14.6 M     tableC
15.8 M     tableA
17.1 M     tableA
17.2 M     tableA
17.5 M     tableD
18.0 M     tableD
18.0 M     tableE
19.2 M     tableF
19.7 M     tableF
22.2 M     tableF
23.3 M     tableF
25.3 M     tableG
26.5 M     tableG
31.4 M     tableA
31.5 M     tableA
31.8 M     tableA
33.0 M     tableA
33.1 M     tableA
33.1 M     tableA
33.6 M     tableA
34.4 M     tableA
35.1 M     tableD
35.5 M     tableE
35.6 M     tableA
36.2 M     tableA
37.1 M     tableA
37.6 M     tableA
37.9 M     tableA
38.8 M     tableA
40.3 M     tableA
40.6 M     tableA
41.0 M     tableA
41.4 M     tableA
41.8 M     tableA
44.1 M     tableA
44.3 M     tableA
45.1 M     tableI
45.2 M     tableA
47.1 M     tableH
47.4 M     tableA
48.5 M     tableA
48.9 M     tableA
50.1 M     tableA
51.0 M     tableA
51.4 M     tableA
51.4 M     tableA
51.7 M     tableA
51.8 M     tableA
51.9 M     tableA
51.9 M     tableG
52.4 M     tableA
52.9 M     tableA
53.1 M     tableA
53.2 M     tableG
53.4 M     tableA
53.7 M     tableG
53.9 M     tableG
56.6 M     tableA
61.8 M     tableA
62.9 M     tableA
63.8 M     tableA
64.6 M     tableA
64.7 M     tableA
66.1 M     tableA
68.2 M     tableA
69.3 M     tableA
69.9 M     tableE
70.4 M     tableD
71.2 M     tableA
71.8 M     tableD
72.9 M     tableE
73.4 M     tableE
89.5 M     tableI
94.9 M     tableH
95.5 M     tableH
96.7 M     tableH
98.2 M     tableH
99.4 M     tableH
101.9 M     tableG
108.3 M     tableG
110.3 M     tableG
111.2 M     tableG
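
For reference, a small sketch of how a list like the one above can be
summarized per table. It assumes each line has the form
"<size> M <table>"; the function name `summarize` and the sample lines are
illustrative only, and the extraction from the region server log is not
shown.

```python
from collections import defaultdict


def summarize(lines):
    """Return (overall mean in MB, {table: (count, mean in MB)})."""
    sizes_by_table = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or parts[1] != "M":
            continue  # skip blank lines and lines not shaped like "<size> M <table>"
        sizes_by_table[parts[2]].append(float(parts[0]))

    all_sizes = [s for sizes in sizes_by_table.values() for s in sizes]
    if not all_sizes:
        return 0.0, {}
    overall = sum(all_sizes) / len(all_sizes)
    per_table = {t: (len(s), sum(s) / len(s)) for t, s in sizes_by_table.items()}
    return overall, per_table


if __name__ == "__main__":
    # Three lines copied from the top of the list; paste the full list to
    # reproduce the overall figure.
    sample = ["6.3 M     tableB", "8.7 M     tableA", "12.3 M     tableA"]
    overall, per_table = summarize(sample)
    print(f"overall mean: {overall:.1f} MB")
    for table, (count, mean) in sorted(per_table.items()):
        print(f"{table}: {count} flushes, mean {mean:.1f} MB")
```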
