hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Common advices for hosting a huge table
Date Wed, 16 Dec 2015 03:00:20 GMT
bq. the down time would be down time for all the tables on that RS

All the tables should be major compacted, but not necessarily around the
same time.
The major compaction schedule can be adjusted to fit the off-peak hours of
the underlying table(s).
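
For example (the table name and the 3 AM schedule below are only
placeholders), you could disable periodic major compaction for the table and
trigger it from cron during its quiet hours:

  # disable time-based major compaction for this table
  echo "alter 'big_table', CONFIGURATION => {'hbase.hregion.majorcompaction' => '0'}" | hbase shell

  # crontab entry: run a major compaction at 03:00 every night
  0 3 * * * echo "major_compact 'big_table'" | hbase shell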

Cheers

On Tue, Dec 15, 2015 at 6:57 PM, 林豪 <linhao@qiyi.com> wrote:

> Thanks for your advice.
>
> For option three, I think a major compaction on a large region will affect
> the performance of the region server. So the down time would be down time
> for all the tables on that RS, am I right?
>
>
>
>
> On 12/16/15, 5:12 AM, "Ted Yu" <yuzhihong@gmail.com> wrote:
>
> >w.r.t. option #1, also consider
> >http://hbase.apache.org/book.html#arch.bulk.load
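> >
> >For reference, the bulk load step itself is roughly (the path and table
> >name below are placeholders):
> >
> >  hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
> >      hdfs:///user/hbase/hfile_output my_table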
> >
> >FYI
> >
> >On Tue, Dec 15, 2015 at 12:17 PM, Frank Luo <jluo@merkleinc.com> wrote:
> >
> >> I am in a very similar situation.
> >>
> >> I guess you can try one of these options.
> >>
> >> Option one: avoid online inserts by preparing the data off-line. Do
> >> something like http://hbase.apache.org/0.94/book/ops_mgt.html#importtsv
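> >>
> >> As a rough sketch (the column mapping, paths and table name below are made
> >> up for illustration), ImportTsv can write HFiles for a later bulk load:
> >>
> >>   hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
> >>       -Dimporttsv.columns=HBASE_ROW_KEY,cf:value \
> >>       -Dimporttsv.bulk.output=hdfs:///tmp/hfile_output \
> >>       my_table hdfs:///tmp/input.tsv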
> >>
> >> Option two: if the first option doesn't work for you, it is better to
> >> reduce your region size and increase the read/write timeouts, so that
> >> compactions can happen while you insert data; since the regions are
> >> smaller, each compaction/split takes less time. With this option you can
> >> have the table available 24/7, but overall performance tends to go down
> >> dramatically once some regions start compacting.
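> >>
> >> For example (the values below are only illustrative), smaller regions and
> >> longer client timeouts might look like this in hbase-site.xml:
> >>
> >>   <property>
> >>     <name>hbase.hregion.max.filesize</name>
> >>     <value>4294967296</value>  <!-- 4 GB regions instead of the 10 GB default -->
> >>   </property>
> >>   <property>
> >>     <name>hbase.rpc.timeout</name>
> >>     <value>120000</value>  <!-- 2 minutes -->
> >>   </property>
> >>   <property>
> >>     <name>hbase.client.operation.timeout</name>
> >>     <value>300000</value>  <!-- 5 minutes -->
> >>   </property>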
> >>
> >> Option three: if you can afford some down time, i.e. two hours every day,
> >> you can manage compactions/splits during that window. What I usually do is
> >> run a major compaction against all tables, then split the ones that are
> >> large so they have enough room to grow for the next day's inserts.
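> >>
> >> In the hbase shell that nightly pass might look roughly like this (the
> >> table name and the placeholders in capitals are illustrative):
> >>
> >>   major_compact 'big_table'
> >>   # then split any region that has grown too large
> >>   split 'REGION_NAME', 'SPLIT_KEY'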
> >>
> >> I hope it helps.
> >>
> >> From: 林豪 [mailto:linhao@qiyi.com]
> >> Sent: Monday, December 14, 2015 11:51 PM
> >> To: user@hbase.apache.org
> >> Subject: Common advices for hosting a huge table
> >>
> >> Hi, all:
> >>
> >> We have an HBase cluster with several hundred region servers, and each RS
> >> hosts nearly 300 regions. Currently one of our tables has grown to 16 TB
> >> and some regions exceed 10 GB. Major compaction on these regions is
> >> painful, as it produces a lot of disk I/O and affects the performance of
> >> the RS. The auto-splitting size under IncreasingToUpperBoundRegionSplitPolicy
> >> has increased to 16 GB or more for this huge table. My solution is to set
> >> the MAX_FILESIZE attribute on this table so that ConstantSizeRegionSplitPolicy
> >> auto splitting will work again.
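> >>
> >> For reference, the attribute can be set from the shell; the 10 GB value
> >> below is just an example:
> >>
> >>   alter 'huge_table', MAX_FILESIZE => '10737418240'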
> >>
> >> My question is: what are the common recommendations or configuration
> >> options for hosting such a huge table? If we decide to limit the region
> >> size, how do we choose the optimal region size? If the region size is too
> >> large, major compaction is painful; but if it is too small, we end up with
> >> a lot of small regions, which will overwhelm the RS.
> >>
> >> 林豪
> >> Cloud Platform R&D Engineer
> >>
> >> iQIYI
> >> QIYI.com, Inc.
> >> Address: 6F, Minrun Building, No. 1388 Yishan Road, Xuhui District, Shanghai
> >> Postal code: 201103
> >> Mobile: +86 136 1180 1618
> >> Tel: +86 21 5451 9520 8393
> >> Fax: +86 21 5451 9529
> >> Email: linhao@qiyi.com
> >> Website: www.iQIYI.com
> >>
>
