hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: requests count per HRegion and rebalance
Date Mon, 07 Feb 2011 17:20:31 GMT
What version of HBase are you using?
Looking at the line numbers, you are on a version older than 0.90.

Also, incrementing the counter in startRegionOperation() doesn't account for
the following call (in 0.90 RC3):
  public Integer obtainRowLock(final byte [] row) throws IOException {
    startRegionOperation();
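
One reading of this concern, as a minimal sketch: if the request counter is bumped inside startRegionOperation(), then any operation that takes a row lock passes through startRegionOperation() a second time and is counted twice. The class RegionSketch and its put() method below are hypothetical stand-ins for illustration, not HBase code; only the names startRegionOperation() and obtainRowLock() come from the snippet above, and closeRegionOperation() is an assumed counterpart.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a region; NOT the real HRegion class.
class RegionSketch {
    // Per-region request counter (assumed to be what the patch adds).
    final AtomicLong requestCount = new AtomicLong();

    void startRegionOperation() {
        // Counting here counts every entry, including nested ones.
        requestCount.incrementAndGet();
    }

    void closeRegionOperation() {
        // Placeholder: a real region would release its read lock here.
    }

    Integer obtainRowLock(byte[] row) {
        startRegionOperation();          // second entry for the same client request
        try {
            return 1;                    // dummy lock id for the sketch
        } finally {
            closeRegionOperation();
        }
    }

    // Hypothetical client-facing operation that needs a row lock.
    void put(byte[] row) {
        startRegionOperation();          // first entry
        try {
            obtainRowLock(row);
        } finally {
            closeRegionOperation();
        }
    }

    public static void main(String[] args) {
        RegionSketch region = new RegionSketch();
        region.put(new byte[] {1});
        // One client request, but the counter was incremented twice.
        System.out.println(region.requestCount.get()); // prints 2
    }
}
```

Under that assumption, counting at the client-facing entry points rather than inside startRegionOperation() would avoid the double count.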

Regards

On Mon, Feb 7, 2011 at 9:15 AM, Sebastian Bauer <admin@ugame.net.pl> wrote:

> Yeah, I know a move takes a while, but we have a small cluster (3 machines),
> and when a few hot regions landed on one machine we had about 4000/1000/1000
> requests on the machines. After rebalancing we now have about 1900/2200/1900,
> so it worked :)
>
> Maybe someone can write a tool to take a few regions from an overloaded
> regionserver and put them on the others, but I don't have the skills to write
> it in Ruby ;)
>
>
> On 07.02.2011 17:44, tsuna wrote:
>
>> Hey this is pretty cool, I've been wanting something like that for a
>> while.  It's a little bit rough but the idea is there.  Ultimately, I
>> hope HBase will be able to provide fine-grained metrics on a
>> per-region basis and use that to do load balancing.
>>
>> The problem right now is that load balancing is very costly for
>> clients, because it takes too long to move a region around in a real
>> cluster.  The region stays down for several seconds (!).  This is
>> really disruptive for high-throughput low-latency user-facing
>> applications.  So until we make the region migration process more
>> seamless, we can't really have a very aggressive / proactive load
>> balancer.
>>
>> The other day I suggested an idea to Stack for changing the region
>> migration process so that it has virtually zero downtime.  It basically
>> involves telling the source region server where the region is going to land,
>> and telling the destination region server to prepare to receive the
>> region.  The source RS would do a first flush and remember the point
>> at which it is (there's a generation ID or something already, whatever
>> is needed for ACID).  The source RS would send an RPC to the
>> destination RS to tell it to start loading whatever was flushed and
>> then it would replicate all the edits to the destination RS.  Once
>> both RS are in sync, the source RS would block requests to the region
>> (by locking it), tell the destination RS about it, and after getting
>> the final ACK from it, would update META and send a special NSRE to
>> all the clients of the blocked requests.  The special NSRE would
>> basically just be like a normal NSRE but it would also say "hint: I
>> think the region is now on that RS".  From the clients' point of view,
>> the region downtime would be pretty minimal (almost unnoticeable).
>> Also, this scheme allows for opportunities such as warming up the
>> block cache in the destination RS before it starts serving.
>>
>>
>
> --
>
> Regards
> Sebastian Bauer
> -----------------------------------------------------
> http://tikecik.pl
>
>
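
On the tool asked for in the quoted message (take a few regions from an overloaded regionserver and put them on the others), here is a rough sketch of just the planning step, in Java rather than Ruby. It assumes per-region request counts are already available as a map of server name to (region name to request count). The class RebalanceSketch, the method names, and the input layout are all hypothetical, and nothing here talks to HBase or performs the actual moves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical planner; it only decides which regions to move, it moves nothing.
class RebalanceSketch {

    // Sum of request counts for one server's regions.
    static long total(Map<String, Long> regions) {
        long sum = 0;
        for (long requests : regions.values()) {
            sum += requests;
        }
        return sum;
    }

    // Greedy plan: repeatedly move one region from the busiest server to the
    // least busy one, choosing the region that best narrows the gap between them.
    // Mutates 'load' to reflect the plan. Each move is {region, from, to}.
    static List<String[]> plan(Map<String, Map<String, Long>> load, int maxMoves) {
        List<String[]> moves = new ArrayList<>();
        for (int i = 0; i < maxMoves; i++) {
            String hot = null;
            String cold = null;
            long hotLoad = Long.MIN_VALUE;
            long coldLoad = Long.MAX_VALUE;
            for (Map.Entry<String, Map<String, Long>> e : load.entrySet()) {
                long t = total(e.getValue());
                if (t > hotLoad) { hotLoad = t; hot = e.getKey(); }
                if (t < coldLoad) { coldLoad = t; cold = e.getKey(); }
            }
            if (hot == null || hot.equals(cold)) {
                break; // zero or one server, or all servers equally loaded
            }
            long gap = hotLoad - coldLoad;
            String best = null;
            long bestRequests = 0;
            for (Map.Entry<String, Long> r : load.get(hot).entrySet()) {
                long requests = r.getValue();
                // Moving 'requests' leaves a gap of |gap - 2 * requests|;
                // only regions smaller than the gap actually shrink it.
                if (requests > 0 && requests < gap
                        && Math.abs(gap - 2 * requests) < Math.abs(gap - 2 * bestRequests)) {
                    best = r.getKey();
                    bestRequests = requests;
                }
            }
            if (best == null) {
                break; // no single move would improve the balance
            }
            load.get(hot).remove(best);
            load.get(cold).put(best, bestRequests);
            moves.add(new String[] {best, hot, cold});
        }
        return moves;
    }
}
```

Capping the number of moves with maxMoves reflects the point made in the quoted discussion that each region move is costly for clients, so a planner like this should probably move only a few of the hottest regions per run.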
