hbase-user mailing list archives

From Frank Luo <j...@merkleinc.com>
Subject RE: Common advices for hosting a huge table
Date Tue, 15 Dec 2015 20:17:39 GMT
I am in a very similar situation.

You could try one of the following options.

Option one: avoid online inserts by preparing the data offline, for example with the tool described at http://hbase.apache.org/0.94/book/ops_mgt.html#importtsv

Option two: if the first option doesn’t work for you, reduce your region size and increase
the read/write timeouts. Compactions will then happen while you insert data, but because the
regions are smaller, each compaction or split takes less time. With this option the table
stays available 24/7, but overall performance tends to drop dramatically once some regions
start compacting.

Option three: if you can afford some downtime, e.g. two hours every day, you can manage compactions
and splits during that window. What I usually do is run a major compaction against all tables, then
split the regions that are large so that they have enough room to grow for the next day’s inserts.
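
As a rough sketch of that routine (not my actual script), the snippet below builds the HBase shell
commands for one maintenance window: a major_compact per table, then a split for every region above
a size threshold. The table name, region names, region sizes, and the 8 GB threshold are all
hypothetical, and the sizes are passed in rather than queried from the cluster.

```python
# Sketch of a maintenance plan: major-compact every table, then split
# regions that have grown past a chosen threshold.
# Table names, region names, sizes, and the threshold are hypothetical.

GB = 1024 ** 3


def plan_maintenance(tables, region_sizes, split_threshold_bytes):
    """Return HBase shell commands as a list of strings.

    tables: table names to major-compact.
    region_sizes: dict mapping region name -> size in bytes
                  (collected separately; nothing is queried here).
    """
    commands = [f"major_compact '{table}'" for table in tables]
    # Largest regions first, so the biggest ones get headroom earliest.
    oversized = sorted(
        (name for name, size in region_sizes.items()
         if size > split_threshold_bytes),
        key=lambda name: region_sizes[name],
        reverse=True,
    )
    commands += [f"split '{name}'" for name in oversized]
    return commands


if __name__ == "__main__":
    sizes = {"region_a": 12 * GB, "region_b": 3 * GB, "region_c": 9 * GB}
    for command in plan_maintenance(["big_table"], sizes, 8 * GB):
        print(command)
```

The sketch only prints commands. Since a major compaction requested from the shell runs in the
background, a real script would probably need to wait for it to finish before issuing the splits.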

I hope this helps.

From: 林豪 [mailto:linhao@qiyi.com]
Sent: Monday, December 14, 2015 11:51 PM
To: user@hbase.apache.org
Subject: Common advices for hosting a huge table

Hi, all:

We have an HBase cluster with several hundred region servers, and each RS hosts nearly
300 regions. Currently one of our tables has grown to 16 TB and some regions exceed 10
GB. Major compaction on these regions is painful because it produces a lot of disk I/O and
affects the performance of the RS. The auto-split size of IncreasingToUpperBoundRegionSplitPolicy
has increased to 16 GB or more for this huge table. My solution is to set the MAX_FILESIZE attribute
on this table so that ConstantSizeRegionSplitPolicy auto splitting will work again.

My question is: what are the common recommendations or configuration options for hosting such a huge
table? If we decide to limit the region size, how can we determine the optimal region size? If the
region size is too large, major compaction is painful; but if it is too small, we end up with
a lot of small regions, which will overwhelm the RS.
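
To make the trade-off concrete, here is a rough back-of-the-envelope calculation. The 16 TB figure is
the table size mentioned above; the count of 300 region servers is only an assumed stand-in for
"several hundred", the candidate sizes are arbitrary, and the table name is hypothetical. It also
prints the shell command that would set MAX_FILESIZE for one candidate size.

```python
# Region count vs. maximum region size for a 16 TB table.
# The server count and candidate sizes are assumptions for illustration.

GB = 1024 ** 3
TB = 1024 ** 4


def regions_needed(table_bytes, max_region_bytes):
    """Minimum number of regions if every region is filled to the limit."""
    return -(-table_bytes // max_region_bytes)  # ceiling division


def alter_command(table, max_region_bytes):
    """HBase shell command that sets MAX_FILESIZE (in bytes) on a table."""
    return f"alter '{table}', MAX_FILESIZE => '{max_region_bytes}'"


if __name__ == "__main__":
    table_bytes = 16 * TB
    region_servers = 300  # assumption, not a measured value
    for size_gb in (2, 5, 10, 16):
        count = regions_needed(table_bytes, size_gb * GB)
        per_server = count / region_servers
        print(f"{size_gb:>2} GB limit -> at least {count} regions, "
              f"about {per_server:.1f} per server for this table")
    print(alter_command("big_table", 5 * GB))
```

These are lower bounds: regions are usually not filled to the limit, so the real region count would be
higher than the figures printed here.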

林豪
Cloud Platform R&D Engineer

iQIYI
QIYI.com, Inc.
Address: 6F, Minrun Building, 1388 Yishan Road, Xuhui District, Shanghai
Postal code: 201103
Mobile: +86 136 1180 1618
Phone: +86 21 5451 9520 8393
Fax: +86 21 5451 9529
Email: linhao@qiyi.com
Website: www.iQIYI.com
