hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: HBase Region Size of 2.5 TB
Date Mon, 29 Aug 2016 01:56:01 GMT
Looking at source of IncreasingToUpperBoundRegionSplitPolicy, I don't see
other parameters being used.

FYI
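
For reference, the size check this policy applies can be sketched as below.
This is a simplified model, not the HBase source: the class and method names
(SplitSizeSketch, initialSize, sizeToCheck) are hypothetical, the
cube-of-region-count rule is an assumption about how the 1.x policy scales the
threshold, and the 10 GB cap assumes the max file size listed in the original
message is the one actually in effect.

```java
public class SplitSizeSketch {
    static final long GB = 1024L * 1024L * 1024L;

    // Mirrors the fallback quoted in this thread: if the initial size is not
    // configured (<= 0), use twice the memstore flush size.
    static long initialSize(long configuredInitialSize, long memstoreFlushSize) {
        return configuredInitialSize > 0 ? configuredInitialSize : 2 * memstoreFlushSize;
    }

    // Assumed rule: the threshold grows with the cube of the number of regions
    // of the table on this region server, capped at the max file size.
    static long sizeToCheck(int regionCount, long initialSize, long maxFileSize) {
        if (regionCount == 0 || regionCount > 100) {
            return maxFileSize;
        }
        long cubed = (long) regionCount * regionCount * regionCount;
        return Math.min(maxFileSize, initialSize * cubed);
    }

    public static void main(String[] args) {
        long initial = initialSize(-1, 1 * GB); // flush size of 1 GB, as reported
        long max = 10 * GB;                     // assumed effective max file size
        for (int n = 1; n <= 4; n++) {
            System.out.println(n + " region(s): split above "
                + sizeToCheck(n, initial, max) / GB + " GB");
        }
    }
}
```

With a 1 GB flush size and nothing overriding the initial size, this model
gives a 2 GB threshold for the first region on a server and the 10 GB cap from
the second region on, which is far below the 2.5 TB observed.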

On Sun, Aug 28, 2016 at 5:58 PM, yeshwanth kumar <yeshwanth43@gmail.com>
wrote:

> Hi Ted,
>
> thanks for the reply,
>
> I couldn't find hbase.increasing.policy.initial.size in the HBase conf;
> we haven't changed that value.
>
> So that means the initial region size should be 2 GB, but the region size is
> 2.5 TB.
> I can manually split the regions, but I am trying to figure out the root cause.
> Are any other conf properties causing this behavior?
>
> please let me know,
>
> Thanks,
> Yeshwanth
>
>
>
> On Fri, Aug 26, 2016 at 5:41 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > From IncreasingToUpperBoundRegionSplitPolicy#configureForRegion():
> >
> >     initialSize = conf.getLong("hbase.increasing.policy.initial.size", -1);
> >
> > ...
> >
> >     if (initialSize <= 0) {
> >       initialSize = 2 * conf.getLong(HConstants.HREGION_MEMSTORE_FLUSH_SIZE,
> >                                      HTableDescriptor.DEFAULT_MEMSTORE_FLUSH_SIZE);
> >
> > If you haven't changed the value for
> > "hbase.increasing.policy.initial.size", the last two lines should have
> > been executed.
> >
> > initialSize would be 2GB in that case according to the config you listed.
> >
> >
> > FYI
> >
> > On Fri, Aug 26, 2016 at 3:23 PM, yeshwanth kumar <yeshwanth43@gmail.com>
> > wrote:
> >
> > > Hi, we are using CDH 5.7 (HBase 1.2).
> > >
> > > We are doing performance testing of HBase under a regular load, on a
> > > cluster which has 4 Region Servers.
> > >
> > > The input data is compressed binary files, around 2 TB, which we process and
> > > write as key-value pairs to HBase.
> > > The output data size in HBase is almost 4 times that, around 8 TB, because
> > > we are writing it as text.
> > > This process is a MapReduce job.
> > >
> > > During the load we observed a lot of GC happening on the Region Servers,
> > > so we changed a couple of parameters to decrease the GC time.
> > >
> > > We increased the flush size from 128 MB to 1 GB, compactionThreshold to 50,
> > > and regionserver.maxlogs to 42.
> > > The following are the configuration values we changed from the defaults:
> > >
> > >
> > > hbase.hregion.memstore.flush.size = 1 GB
> > > hbase.hstore.max.filesize=10GB
> > > hbase.hregion.preclose.flush.size= 50 MB
> > >
> > > hbase.hstore.compactionThreshold=50
> > > hbase.regionserver.maxlogs=42
> > >
> > > After the load, we observed that the HBase table has only 4 regions, each
> > > around 2.5 TB in size.
> > >
> > > I am trying to understand which configuration parameter caused this issue.
> > >
> > > i was going through this article
> > > http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
> > >
> > > The region split policy in our HBase is
> > > org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy
> > > According to the region split policy, the Region Server should split a
> > > region when the region size limit is exceeded.
> > > Can someone explain the root cause?
> > >
> > >
> > > Thanks,
> > > Yeshwanth
> > >
> >
>
