hbase-dev mailing list archives

From Sebastian Bauer <ad...@ugame.net.pl>
Subject Re: requests count per HRegion and rebalance
Date Mon, 07 Feb 2011 17:27:03 GMT
JIRA created: https://issues.apache.org/jira/browse/HBASE-3507

On 07.02.2011 18:07, Stack wrote:
> Benoît:
> Stick the below up in an issue.  It's good stuff.
> St.Ack
> On Mon, Feb 7, 2011 at 8:44 AM, tsuna <tsunanet@gmail.com> wrote:
>> Hey this is pretty cool, I've been wanting something like that for a
>> while.  It's a little bit rough but the idea is there.  Ultimately, I
>> hope HBase will be able to provide fine-grained metrics on a
>> per-region basis and use that to do load balancing.
>> The problem right now is that load balancing is very costly for
>> clients, because it takes too long to move a region around in a real
>> cluster.  The region stays down for several seconds (!).  This is
>> really disruptive for high-throughput low-latency user-facing
>> applications.  So until we make the region migration process more
>> seamless, we can't really have a very aggressive / proactive load
>> balancer.
>> The other day I suggested an idea to Stack to change the region
>> migration process with virtually zero downtime.  It basically involves
>> telling the source region server where the region is going to land,
>> and telling the destination region server to prepare to receive the
>> region.  The source RS would do a first flush and remember the point
>> at which it is (there's a generation ID or something already, whatever
>> is needed for ACID).  The source RS would send an RPC to the
>> destination RS to tell it to start loading whatever was flushed and
>> then it would replicate all the edits to the destination RS.  Once
>> both RS are in sync, the source RS would block requests to the region
>> (by locking it), tell the destination RS about it, and after getting
>> the final ACK from it, would update META and send a special NSRE to
>> all the clients of the blocked requests.  The special NSRE would
>> basically just be like a normal NSRE but it would also say "hint: I
>> think the region is now on that RS".  From the clients' point of view,
>> the region downtime would be pretty minimal (almost unnoticeable).
>> Also, this scheme allows for opportunities such as warming up the
>> block cache in the destination RS before it starts serving.
>> --
>> Benoit "tsuna" Sigoure
>> Software Engineer @ www.StumbleUpon.com
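
To make the proposed handoff easier to discuss, here is a rough sketch of the steps tsuna describes (first flush, load on the destination, replicate edits, lock, final ACK, META update, NSRE with a hint). It is a toy in-memory model, not HBase code: RegionServer, NotServingRegionError, migrate, and the writes_during_copy parameter are all hypothetical names invented for illustration, and a list length stands in for the real sequence/generation ID.

```python
from dataclasses import dataclass, field


class NotServingRegionError(Exception):
    """Toy NSRE that can carry a hint about the region's new location."""

    def __init__(self, region, hint=None):
        super().__init__(f"{region} is not served here")
        self.region = region
        self.hint = hint


@dataclass
class RegionServer:
    # Hypothetical stand-in for a region server; nothing here is HBase API.
    name: str
    edits: dict = field(default_factory=dict)   # region -> [(seq_id, key, value)]
    blocked: set = field(default_factory=set)   # regions locked for the handoff
    moved: dict = field(default_factory=dict)   # region -> name of the new host

    def put(self, region, key, value):
        if region in self.moved:
            # The "special NSRE": a normal NSRE plus a location hint.
            raise NotServingRegionError(region, hint=self.moved[region])
        if region in self.blocked:
            raise RuntimeError("region locked; a real server would queue this")
        log = self.edits[region]
        log.append((len(log) + 1, key, value))

    def receive(self, region, batch):
        # Destination side: load flushed data or apply replicated edits.
        self.edits.setdefault(region, []).extend(batch)
        return len(self.edits[region])  # ACK: last applied sequence id


def migrate(region, source, dest, meta, writes_during_copy=()):
    # 1. First flush: remember the sequence id of the flush point.
    flushed = list(source.edits[region])
    flush_seq = len(flushed)
    # 2. Destination loads the flushed data while the source keeps serving.
    dest.receive(region, flushed)
    for key, value in writes_during_copy:
        source.put(region, key, value)
    # 3. Replicate the edits made after the flush point.
    tail = source.edits[region][flush_seq:]
    dest.receive(region, tail)
    # 4. Lock the region, ship anything left, wait for the final ACK.
    source.blocked.add(region)
    synced = flush_seq + len(tail)
    ack = dest.receive(region, source.edits[region][synced:])
    assert ack == len(source.edits[region]), "destination is not in sync"
    # 5. Update META, then answer further requests with the hinted NSRE.
    meta[region] = dest.name
    source.moved[region] = dest.name
    source.blocked.discard(region)
    del source.edits[region]


if __name__ == "__main__":
    meta = {"r1": "rs-a"}
    src = RegionServer("rs-a", edits={"r1": []})
    dst = RegionServer("rs-b")
    src.put("r1", "k1", "v1")
    migrate("r1", src, dst, meta, writes_during_copy=[("k2", "v2")])
    try:
        src.put("r1", "k3", "v3")
    except NotServingRegionError as e:
        print("retry on", e.hint)
```

In this model the region is only unavailable during step 4, which is the point of the proposal. The sketch does not cover failure handling (for example, the destination dying before the final ACK) or the block cache warm-up mentioned above.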


Sebastian Bauer
