hbase-dev mailing list archives

From Sebastian Bauer <ad...@ugame.net.pl>
Subject Re: requests count per HRegion and rebalance
Date Mon, 07 Feb 2011 17:15:15 GMT
Yeah, I know a move takes a while, but we have a small cluster (3 machines),
and when a few hot regions landed on one machine we had about
4000/1000/1000 requests on the machines. After rebalancing we now have
about 1900/2200/1900, so it worked :)

Maybe someone can write a tool to take a few regions from an overloaded
regionserver and put them on the others, but I don't have the skills to
write it in Ruby ;)
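
For illustration, here is a rough sketch of what the planning step of such a tool might look like, in Python rather than Ruby. It only works on numbers: it assumes per-region request counts are already available as a dict, and the function name, the `tolerance` parameter and the server/region names are all made up for the example. It does not talk to HBase; the resulting plan would still have to be carried out with actual region moves.

```python
def plan_moves(load, tolerance=0.1):
    """Plan region moves that even out request load (hypothetical helper).

    load: {server: {region: requests}}. Returns (moves, totals), where moves
    is a list of (region, source, destination) and totals is the per-server
    request count once the plan is applied. Does not contact a cluster.
    """
    # Work on a copy so the caller's data is left untouched.
    load = {server: dict(regions) for server, regions in load.items()}
    target = sum(sum(r.values()) for r in load.values()) / len(load)
    moves = []
    while True:
        totals = {server: sum(r.values()) for server, r in load.items()}
        src = max(totals, key=totals.get)  # most loaded server
        dst = min(totals, key=totals.get)  # least loaded server
        if totals[src] <= target * (1 + tolerance):
            break  # close enough to an even split
        gap = totals[src] - totals[dst]
        # Only a region smaller than the gap narrows the difference.
        candidates = [r for r, n in load[src].items() if 0 < n < gap]
        if not candidates:
            break  # nothing left that would help
        # Prefer the region closest to half the gap.
        region = min(candidates, key=lambda r: abs(load[src][r] - gap / 2))
        load[dst][region] = load[src].pop(region)
        moves.append((region, src, dst))
    return moves, totals


# Made-up numbers shaped like the 4000/1000/1000 situation above.
load = {
    "rs1": {"r1": 1500, "r2": 1500, "r3": 500, "r4": 500},
    "rs2": {"r5": 500, "r6": 500},
    "rs3": {"r7": 1000},
}
moves, totals = plan_moves(load)
for region, src, dst in moves:
    print(f"move {region}: {src} -> {dst}")
print(totals)
```

Picking the region closest to half the gap is just one simple greedy choice. Since every move costs some downtime, a real tool would probably also want to cap the number of moves per run.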

On 07.02.2011 17:44, tsuna wrote:
> Hey this is pretty cool, I've been wanting something like that for a
> while.  It's a little bit rough but the idea is there.  Ultimately, I
> hope HBase will be able to provide fine-grained metrics on a
> per-region basis and use that to do load balancing.
> The problem right now is that load balancing is very costly for
> clients, because it takes too long to move a region around in a real
> cluster.  The region stays down for several seconds (!).  This is
> really disruptive for high-throughput low-latency user-facing
> applications.  So until we make the region migration process more
> seamless, we can't really have a very aggressive / proactive load
> balancer.
> The other day I suggested an idea to Stack to change the region
> migration process with virtually zero downtime.  It basically involves
> telling the source region server where the region is going to land,
> and telling the destination region server to prepare to receive the
> region.  The source RS would do a first flush and remember the point
> at which it is (there's a generation ID or something already, whatever
> is needed for ACID).  The source RS would send an RPC to the
> destination RS to tell it to start loading whatever was flushed and
> then it would replicate all the edits to the destination RS.  Once
> both RS are in sync, the source RS would block requests to the region
> (by locking it), tell the destination RS about it, and after getting
> the final ACK from it, would update META and send a special NSRE to
> all the clients of the blocked requests.  The special NSRE would
> basically just be like a normal NSRE but it would also say "hint: I
> think the region is now on that RS".  From the clients' point of view,
> the region downtime would be pretty minimal (almost unnoticeable).
> Also, this scheme allows for opportunities such as warming up the
> block cache in the destination RS before it starts serving.
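
The handoff described in the quoted message can be walked through with a toy, single-process model. This is only a sketch of the ordering of the steps as I read them: `RegionServer`, `migrate`, `blocked` and the reply dict are invented for the example, the "RPCs" are plain method calls, and nothing here corresponds to actual HBase code.

```python
class RegionServer:
    """Toy region server: maps region name -> {row key: value}."""

    def __init__(self, name):
        self.name = name
        self.regions = {}
        self.blocked = set()  # regions currently locked for the handoff


def migrate(region, src, dst, meta, edits_during_copy):
    """Move `region` from src to dst; returns the reply for blocked clients."""
    # 1. First flush on the source: a snapshot the destination can load.
    flushed = dict(src.regions[region])
    # 2. The destination loads the flushed data; the source keeps serving.
    dst.regions[region] = dict(flushed)
    # 3. Edits arriving in the meantime are applied on the source and
    #    replicated to the destination.
    for key, value in edits_during_copy:
        src.regions[region][key] = value
        dst.regions[region][key] = value
    # 4. Both sides should now be in sync: the source blocks the region.
    src.blocked.add(region)
    if src.regions[region] != dst.regions[region]:
        src.blocked.discard(region)
        raise RuntimeError("destination is not in sync; aborting handoff")
    # 5. After the final ACK: update META and drop the region on the source.
    meta[region] = dst.name
    del src.regions[region]
    src.blocked.discard(region)
    # 6. Blocked clients get an NSRE-like reply that carries a hint.
    return {"error": "NotServingRegionException", "hint": dst.name}


src, dst = RegionServer("rs1"), RegionServer("rs2")
src.regions["r1"] = {"row-a": 1}
meta = {"r1": "rs1"}
reply = migrate("r1", src, dst, meta, edits_during_copy=[("row-b", 2)])
print(reply, meta)
```

The region is only blocked during steps 4 to 6, which is where the "almost unnoticeable" downtime would come from. Loading the flushed data and catching up on edits happen while the source is still serving.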


Sebastian Bauer
