hbase-dev mailing list archives

From Sebastian Bauer <ad...@ugame.net.pl>
Subject Re: requests count per HRegion and rebalance
Date Mon, 07 Feb 2011 17:26:25 GMT
It's the trunk version:

Path: .
URL: https://svn.apache.org/repos/asf/hbase/trunk
Repository Root: https://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1067865
Node Kind: directory
Schedule: normal
Last Changed Author: stack
Last Changed Rev: 1067598
Last Changed Date: 2011-02-06 06:46:39 +0100 (Sun, 06 Feb 2011)

On 07.02.2011 18:20, Ted Yu wrote:
> What version of HBase do you use?
> Looking at the line numbers, you are using a version older than 0.90.
>
> Also, incrementing the counter in startRegionOperation() doesn't take into
> account that the following call invokes it as well (in 0.90 RC3):
>    public Integer obtainRowLock(final byte [] row) throws IOException {
>      startRegionOperation();
>
> Regards
>
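Ted's point about the double count can be sketched in a few lines. This is a minimal, self-contained illustration, not HBase's actual classes: the names below are hypothetical, and only the call pattern (obtainRowLock() internally calling startRegionOperation()) mirrors the 0.90 RC3 code quoted above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a per-region request counter bumped in startRegionOperation().
// Because obtainRowLock() also calls startRegionOperation(), a single
// operation that takes a row lock is counted twice.
public class RegionRequestCounter {
    private final ConcurrentHashMap<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // Every entry point into the region bumps the counter.
    public void startRegionOperation(String region) {
        counts.computeIfAbsent(region, r -> new AtomicLong()).incrementAndGet();
    }

    // Mirrors the quoted 0.90 RC3 pattern: obtainRowLock() calls
    // startRegionOperation() itself, so a caller that already called
    // startRegionOperation() inflates the count.
    public void obtainRowLock(String region) {
        startRegionOperation(region);
        // ... take the actual row lock here ...
    }

    public long count(String region) {
        AtomicLong c = counts.get(region);
        return c == null ? 0 : c.get();
    }
}
```

One logical put that calls startRegionOperation() and then obtainRowLock() ends up with a count of 2, not 1, which is exactly the skew Ted is warning about.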
>> On Mon, Feb 7, 2011 at 9:15 AM, Sebastian Bauer <admin@ugame.net.pl> wrote:
>
>> Yeah, I know a move takes a while, but we have a small cluster (3 machines),
>> and when a few hot regions landed on one machine we were seeing about
>> 4000/1000/1000 requests across the machines; after rebalancing we now see
>> about 1900/2200/1900, so it worked :)
>>
>> Maybe someone could write a tool that takes a few regions from an overloaded
>> regionserver and puts them on the others, but I don't have the skills to
>> write it in Ruby ;)
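The selection logic such a tool might use can be sketched without any HBase dependency (in Java rather than Ruby). This is only the greedy "shift load from the busiest server to the least busy one" step; actually moving a region (e.g. via the admin API) is left out, the per-server request totals are assumed to come from the kind of per-region counters discussed in this thread, and `perRegion` stands in for the load one moved region is assumed to carry.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Greedy rebalancing sketch: repeatedly move one region's worth of load
// from the most loaded server to the least loaded one, until the spread
// between them is no bigger than a single region's load.
public class GreedyRebalancer {
    // serverLoads: server name -> total requests.
    // perRegion:   assumed request load carried by one moved region.
    // maxMoves:    safety cap on the number of moves.
    public static Map<String, Long> rebalance(Map<String, Long> serverLoads,
                                              long perRegion, int maxMoves) {
        Map<String, Long> loads = new HashMap<>(serverLoads);
        for (int i = 0; i < maxMoves; i++) {
            String hot  = Collections.max(loads.entrySet(), Map.Entry.comparingByValue()).getKey();
            String cold = Collections.min(loads.entrySet(), Map.Entry.comparingByValue()).getKey();
            if (loads.get(hot) - loads.get(cold) <= perRegion) break; // close enough
            loads.put(hot,  loads.get(hot)  - perRegion);
            loads.put(cold, loads.get(cold) + perRegion);
        }
        return loads;
    }
}
```

With the thread's numbers (4000/1000/1000) and a per-region load of 500, this converges to an even 2000/2000/2000 split in a handful of moves while preserving the total.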
>>
>>
>> On 07.02.2011 17:44, tsuna wrote:
>>
>>> Hey this is pretty cool, I've been wanting something like that for a
>>> while.  It's a little bit rough but the idea is there.  Ultimately, I
>>> hope HBase will be able to provide fine-grained metrics on a
>>> per-region basis and use that to do load balancing.
>>>
>>> The problem right now is that load balancing is very costly for
>>> clients, because it takes too long to move a region around in a real
>>> cluster.  The region stays down for several seconds (!).  This is
>>> really disruptive for high-throughput low-latency user-facing
>>> applications.  So until we make the region migration process more
>>> seamless, we can't really have a very aggressive / proactive load
>>> balancer.
>>>
>>> The other day I suggested an idea to Stack to change the region
>>> migration process with virtually zero downtime.  It basically involves
>>> telling the source region server where the region is going to land,
>>> and telling the destination region server to prepare to receive the
>>> region.  The source RS would do a first flush and remember the point
>>> at which it is (there's a generation ID or something already, whatever
>>> is needed for ACID).  The source RS would send an RPC to the
>>> destination RS to tell it to start loading whatever was flushed and
>>> then it would replicate all the edits to the destination RS.  Once
>>> both RS are in sync, the source RS would block requests to the region
>>> (by locking it), tell the destination RS about it, and after getting
>>> the final ACK from it, would update META and send a special NSRE to
>>> all the clients of the blocked requests.  The special NSRE would
>>> basically just be like a normal NSRE but it would also say "hint: I
>>> think the region is now on that RS".  From the clients' point of view,
>>> the region downtime would be pretty minimal (almost unnoticeable).
>>> Also, this scheme allows for opportunities such as warming up the
>>> block cache in the destination RS before it starts serving.
>>>
>>>
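The migration scheme tsuna outlines is a fixed sequence of steps, and writing the ordering down makes the small downtime window explicit. The enum below is purely illustrative (nothing like it exists in HBase); it only pins down the order of the steps described above, with the region unavailable only between BLOCK_REGION and UPDATE_META_AND_NSRE.

```java
// Hypothetical ordering of the near-zero-downtime migration steps described
// in the thread. Declaration order is the protocol order.
public enum MigrationStep {
    NOTIFY_DESTINATION,   // tell the destination RS to prepare to receive the region
    FLUSH_AND_MARK,       // source RS flushes and remembers the generation ID
    LOAD_FLUSHED_FILES,   // destination RS loads whatever was flushed
    REPLICATE_EDITS,      // source RS streams remaining edits until both are in sync
    BLOCK_REGION,         // source RS locks the region; downtime window opens here
    FINAL_ACK,            // destination RS confirms it is fully caught up
    UPDATE_META_AND_NSRE  // update META, send the hinted NSRE; downtime window closes
}
```

Everything before BLOCK_REGION happens while the source RS is still serving, which is also where the suggested block-cache warm-up on the destination would fit.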


-- 

Pozdrawiam
Sebastian Bauer
-----------------------------------------------------
http://tikecik.pl

