hbase-user mailing list archives

From Bryan Beaudreault <bbeaudrea...@hubspot.com>
Subject Re: How to merge regions in HBase?
Date Wed, 18 Jul 2012 14:32:18 GMT
Shouldn't it be possible for him to have empty regions if he has a TTL on his data? 
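
A rough sketch of the reasoning, not taken from anyone's actual cluster: if row keys lead with a timestamp, each region covers a fixed slice of time, so any region whose whole slice is older than the TTL cutoff ends up with no live rows once compactions have dropped the expired cells. The region layout below (one boundary per day) and the function name are made up purely for illustration.

```python
from datetime import datetime, timedelta

def count_empty_regions(region_starts, now, ttl):
    """Count regions whose entire key range is older than the TTL cutoff.

    region_starts is a sorted list of datetimes. Region i covers
    [region_starts[i], region_starts[i + 1]); the last region is open-ended
    and still receives writes, so it is never counted as empty.
    """
    cutoff = now - ttl
    empty = 0
    for end in region_starts[1:]:
        # Every row in this region has a timestamp strictly below `end`,
        # so if `end` is at or before the cutoff, all of its rows have expired.
        if end <= cutoff:
            empty += 1
    return empty

# Hypothetical table: one region boundary per day for 180 days, 30-day TTL.
now = datetime(2012, 7, 18)
starts = [now - timedelta(days=d) for d in range(180, 0, -1)]
empty = count_empty_regions(starts, now, timedelta(days=30))
print(empty, "of", len(starts), "regions hold only expired data")
```

With these made-up numbers most of the older regions are empty, which is the pattern I would expect from time-ordered keys plus a TTL, even without pre-splitting.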

-- 
Bryan Beaudreault


On Wednesday, July 18, 2012 at 9:58 AM, Kevin O'dell wrote:

> Also, depending on the version of HBase you are running, you may have
> to bring down the whole cluster to merge, not just the table:
> 
> https://issues.apache.org/jira/browse/HBASE-1621
> 
> On Tue, Jul 17, 2012 at 7:26 PM, Amandeep Khurana <amansk@gmail.com> wrote:
> 
> > You shouldn't have empty regions. Using a timestamp as the key prefix gives
> > you regions that are always half filled, except the last one, to which you
> > are writing the current time range. The moment that one fills up, it splits,
> > and you are again writing to the last region. How did you end up
> > with empty regions? Did you pre-split?
> > 
> > On Jul 17, 2012, at 7:15 PM, Michael Segel <michael_segel@hotmail.com> wrote:
> > 
> > > Find a different row key?
> > > 
> > > The problem with merging regions is that once you merge the regions, any
> > > net new regions will still have the same problem. So you'll have to merge
> > > again, and again and again.
> > > You're always filling to the left of the last key.
> > > 
> > > In order to merge, you have to take the table offline. At least that's
> > > my understanding. So it's not a good thing.
> > > 
> > > 
> > > On Jul 17, 2012, at 11:08 AM, Ionut Ignatescu wrote:
> > > 
> > > > My use case: I have several tables with keys starting with a timestamp.
> > > > Also, these tables have data retention set to 30 days.
> > > > Table size is around 1 TB (3 TB replicated) and data is inserted regularly
> > > > (every 5 minutes, ~200 MB is inserted).
> > > > File size is set to 1 GB. I have had these tables in use for almost half a
> > > > year, and now a table has around 6k partitions and 40% of them are empty.
> > > > The problem: the number of regions per region server is now pretty high.
> > > > Questions:
> > > > Which approach is better?
> > > > - to merge adjacent empty partitions into a bigger one?
> > > > - to merge empty partitions into non-empty partitions?
> > > > Also, I'm wondering why region merging is not part of major compactions
> > > > and why it's necessary to stop the
> > > > entire fleet to solve this problem.
> > > > 
> > > > 
> > > > 
> > > > Regards,
> > > > 
> > > > Ionut I.
> 
> 
> 
> -- 
> Kevin O'Dell
> Customer Operations Engineer, Cloudera
> 
> 


