hbase-user mailing list archives

From Andy Sautins <andy.saut...@returnpath.net>
Subject RE: Performance of region merges...
Date Mon, 28 Mar 2011 02:57:40 GMT

  Thank you, Ted.  I had not heard of HMerge yet, but I will take a look.

  I appreciate the help.


-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: Sunday, March 27, 2011 8:54 PM
To: user@hbase.apache.org
Subject: Re: Performance of region merges...

Merge.java currently only accepts two regions.

Have you looked at HMerge?
Its condition seems to satisfy your requirement:
   * When merging a normal table, the HBase instance must be online, but the
   * table must be disabled.

On Sun, Mar 27, 2011 at 5:06 PM, Andy Sautins

>    I have an issue I'm hoping to get some insight into.  We currently have
> a table that has roughly 18k regions.  When we originally created the table
> we didn't realize we should make the regions bigger, and have subsequently
> changed MAX_FILESIZE to something larger.  We are no longer rapidly creating
> new regions, but we still have the large number of regions.  I've been
> investigating using the merge tool to try to reduce the number of regions to
> something more reasonable for our needs.  The issue I've run into is that
> the merge tool seems to run somewhat slowly.  On a test table that holds a
> sample of the data in our main table, I have roughly 8 million rows of
> approximately 1 KB each across 48 regions.  Using the merge tool I can reduce the
> number of regions down to 24 by running the merge tool over pairs of regions
> and all seems to work well.  However, for those 48 regions it takes roughly
> 30 minutes.  It's not the end of the world for this table if it takes a
> while, but given that the cluster needs to be offline when using the merge
> tool, merging has a larger impact than I'd like it to have.
>    I guess the question I have is: if I have a lot more regions than I want,
> is there a way to merge the regions down to a smaller number in a reasonably
> efficient manner?  Can I run the merge tool on multiple regions at the same
> time?  Are there alternatives to the merge tool?  Could I export/import the
> data or some other method?
>   We are currently running 0.90.1.
>   Any insights would be much appreciated.
>   Andy
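
For a rough sense of scale, the figures quoted above can be turned into a
back-of-the-envelope estimate. This is only a sketch: it assumes merges run
one pair at a time, that each pairwise merge removes exactly one region, and
that the per-merge cost seen on the test table (48 regions down to 24 in
about 30 minutes) carries over unchanged to the main table, which may well
not hold. The function name and the 9,000-region target are hypothetical,
chosen only for illustration.

```python
def estimate_merge_seconds(start_regions, target_regions, seconds_per_merge):
    """Estimate total time for sequential pairwise merges.

    Assumes each merge combines two regions into one, so going from
    start_regions to target_regions takes (start - target) merges.
    """
    if target_regions < 1 or target_regions > start_regions:
        raise ValueError("target_regions must be between 1 and start_regions")
    merges = start_regions - target_regions
    return merges * seconds_per_merge


# Observed on the test table: 48 -> 24 regions (24 merges) in ~30 minutes.
observed_seconds_per_merge = (30 * 60) / (48 - 24)

# Hypothetical target: halve the ~18k regions of the main table.
full_estimate = estimate_merge_seconds(18_000, 9_000, observed_seconds_per_merge)

print(f"per merge: {observed_seconds_per_merge:.0f} s")
print(f"18k -> 9k regions: {full_estimate / 3600:.1f} hours offline")
```

Under those assumptions the estimate comes out at about 75 seconds per merge
and on the order of a week of offline time to halve the region count, which
is why a tool that handles more than two regions per run is worth a look.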
