hbase-user mailing list archives

From Guillermo Ortiz <konstt2...@gmail.com>
Subject Re: Weird behavior splitting regions
Date Tue, 15 Apr 2014 10:35:18 GMT
I had read the article; that's why I asked the question, because I didn't
understand the result I got.

Oh, yes, that's true. Silly of me.
I think some of the files are pretty small because the table has two
families and one of them is much smaller than the other, so it has been
split many times. The big regions end up close to 1 GB, but the smaller
regions end up very small because they have been split so many times.
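
To make the spread concrete, here is a minimal sketch that parses the size listing quoted further down and picks out the tiny regions. It assumes the listing has the form "<number> <unit> <path>"; the helper name parse_du_line and the 10 MB cutoff for "small" are hypothetical choices for illustration, not anything HBase defines.

```python
# Sketch: summarize the region sizes from the listing in the original message.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_du_line(line):
    """Return (region_name, size_in_bytes) for one '<num> <unit> <path>' line."""
    size, unit, path = line.split()
    return path.rsplit("/", 1)[-1], int(float(size) * UNITS[unit])

listing = """\
1.7 M    /hbase/filters/4ddbc34a2242e44c03121ae4608788a2
1.6 G    /hbase/filters/548bdcec79cfe9a99fa57cb18f801be2
3.1 G    /hbase/filters/58b50df089bd9d4d1f079f53238e060d
2.5 M    /hbase/filters/5a0d6d5b3b8faf67889ac5f5c2947c4f
1.9 G    /hbase/filters/5b0a35b5735a473b7e804c4b045ce374
883.4 M  /hbase/filters/5b49c68e305b90d87b3c64a0eee60b8c
1.7 M    /hbase/filters/5d43fd7ea9808ab7d2f2134e80fbfae7
632.4 M  /hbase/filters/5f04c7cd450d144f88fb4c7cff0796a2
"""

regions = dict(parse_du_line(line) for line in listing.splitlines())

# Hypothetical cutoff: call a region "small" if it is under 10 MB.
SMALL_CUTOFF = 10 * 1024 ** 2
small = sorted(name for name, size in regions.items() if size < SMALL_CUTOFF)

print(f"{len(small)} of {len(regions)} regions are under 10 MB:")
for name in small:
    print(" ", name)
```

In this sample, the regions under the cutoff are the ones listed at 1.7 M and 2.5 M.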

What I don't understand is why HBase decides to split the table so late:
not when I create the pre-split table, but two hours later or so.
Anyway, that was my mistake; I'm just curious about it.
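
For reference, a minimal sketch of the size threshold that the 0.94 default split policy uses, as the article linked in the quoted reply describes it: min(R^2 * memstore flush size, max file size), where R is the number of regions of the same table on the same region server. The 128 MB flush size is an assumed default, the 1 GB maximum is the value mentioned in this thread, and the function name is hypothetical. This only gives the size at which a region becomes a split candidate; it does not by itself explain the two-hour delay.

```python
# Sketch of the split-size threshold described for HBase 0.94's default policy.
MB = 1024 ** 2
GB = 1024 ** 3

def split_threshold(regions_on_server, flush_size=128 * MB, max_file_size=1 * GB):
    """Store size in bytes above which a region becomes a split candidate.

    regions_on_server: regions of this table hosted on the same region server.
    flush_size: assumed value of hbase.hregion.memstore.flush.size.
    max_file_size: hbase.hregion.max.filesize (1 GB in this thread).
    """
    if regions_on_server == 0:
        return max_file_size
    return min(max_file_size, flush_size * regions_on_server ** 2)

for r in range(1, 5):
    print(r, "region(s):", split_threshold(r) // MB, "MB")
```

With a 1 GB maximum the cap is already reached at three regions per server, so a table pre-split into four 30 GB regions is far above the threshold from the start.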


2014-04-15 12:17 GMT+02:00 divye sheth <divs.sheth@gmail.com>:

> The default split policy in HBase 0.94.x is
> IncreasingToUpperBoundRegionSplitPolicy rather than
> ConstantSizeRegionSplitPolicy, which was the default in older versions of
> HBase.
>
> Please refer to the link below to understand how
> IncreasingToUpperBoundRegionSplitPolicy works:
> http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
> (see the auto-splitting section).
>
> Hope this answers your question
>
> Thanks
> Divye Sheth
>
>
>
> On Tue, Apr 15, 2014 at 3:36 PM, Bharath Vissapragada <bharathv@cloudera.com> wrote:
>
> > > There are some new regions that are just a few KB! Why are they so
> > > small? When does HBase decide to split? It started to split two
> > > hours after the table was created.
> >
> > When HBase does a split, it doesn't actually split at the disk/file level.
> > It's just a metadata operation that creates new regions containing
> > reference files that still point to the old HFiles. That is the reason you
> > find KB-sized regions.
> >
> > > I thought major compactions happen just once a day and compact many
> > > files per region. The data is always the same here; I don't insert new
> > > data.
> >
> > IIRC sometimes minor compactions get promoted to major compactions based
> > on some criteria, but I'll leave it for others to answer!
> >
> >
> >
> > On Tue, Apr 15, 2014 at 3:15 PM, Guillermo Ortiz <konstt2000@gmail.com> wrote:
> >
> > > I have a table in HBase that is around 96 GB.
> > >
> > > I generated 4 regions of 30 GB. After some time, the table starts to
> > > split because the max region size is 1 GB (I just realized that; I'm
> > > going to change it or create more pre-splits).
> > >
> > > There are two things that I don't understand. How is it creating the
> > > splits? Right now I have 130 regions and growing. The problem is the
> > > size of the new regions:
> > >
> > > 1.7 M    /hbase/filters/4ddbc34a2242e44c03121ae4608788a2
> > > 1.6 G    /hbase/filters/548bdcec79cfe9a99fa57cb18f801be2
> > > 3.1 G    /hbase/filters/58b50df089bd9d4d1f079f53238e060d
> > > 2.5 M    /hbase/filters/5a0d6d5b3b8faf67889ac5f5c2947c4f
> > > 1.9 G    /hbase/filters/5b0a35b5735a473b7e804c4b045ce374
> > > 883.4 M  /hbase/filters/5b49c68e305b90d87b3c64a0eee60b8c
> > > 1.7 M    /hbase/filters/5d43fd7ea9808ab7d2f2134e80fbfae7
> > > 632.4 M  /hbase/filters/5f04c7cd450d144f88fb4c7cff0796a2
> > >
> > > There are some new regions that are just a few KB! Why are they so
> > > small? When does HBase decide to split? It started to split two
> > > hours after the table was created.
> > >
> > > Once I create the table and insert the data, I don't insert new data
> > > or modify it.
> > >
> > >
> > > Another interesting point is why there are major compactions:
> > > 2014-04-15 11:33:47,400 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming compacted file at hdfs://m01.cluster:8020/hbase/filters/ef994715505054299ede8c48c600cea4/.tmp/df90c260cb4e4256a153dd178244f04c to hdfs://m01.cluster:8020/hbase/filters/ef994715505054299ede8c48c600cea4/d/df90c260cb4e4256a153dd178244f04c
> > > 2014-04-15 11:33:47,407 INFO org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded ROWCOL (CompoundBloomFilter) metadata for df90c260cb4e4256a153dd178244f04c
> > > 2014-04-15 11:33:47,416 INFO org.apache.hadoop.hbase.regionserver.Store: *Completed major compaction of 1 file*(s) in d of filters,51,1397554175140.ef994715505054299ede8c48c600cea4. into df90c260cb4e4256a153dd178244f04c, size=789.1 M; total size for store is 789.1 M
> > > 2014-04-15 11:33:47,416 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=filters,51,1397554175140.ef994715505054299ede8c48c600cea4., storeName=d, fileCount=1, fileSize=1.5 G, priority=6, time=414761474510060; duration=7sec
> > >
> > > I thought major compactions happen just once a day and compact many
> > > files per region. The data is always the same here; I don't insert new
> > > data.
> > >
> > >
> > > I'm working with 0.94.6 (CDH 4.4). I'm going to change the size of the
> > > regions, but I would like to understand why things happen.
> > >
> > > Thank you.
> > >
> >
> >
> >
> > --
> > Bharath Vissapragada
> > <http://www.cloudera.com>
> >
>
