hbase-user mailing list archives

From Doug Meil <doug.m...@explorysmedical.com>
Subject RE: Incoming Row Distribution Strategy/Algorithm Among Region Servers?
Date Wed, 15 Jun 2011 18:34:10 GMT
As Chris described below, there is an example of #2 in the book...


In the first scenario the table just grows "naturally."

The key point is that by the time the Put goes to the RegionServer it's already going to the
"right" spot, because the client is aware of all the regions and their start/end keys.

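A minimal sketch of that client-side lookup, using the regions from Dave's example quoted
below. This is not the HBase client code: the names REGIONS, START_KEYS, and locate_server are
hypothetical, the open-ended last region is an assumption, and the step where the client
fetches and caches region locations is left out. Keys are byte strings compared
lexicographically, as HBase row keys are.

```python
from bisect import bisect_right

# Hypothetical cached region map, sorted by start key: (start_key, end_key, server).
# An empty end key means "no upper bound" in this sketch (an assumption).
REGIONS = [
    (b"0", b"1", "A"),
    (b"1", b"4", "B"),
    (b"4", b"", "C"),
]
START_KEYS = [start for start, _end, _server in REGIONS]


def locate_server(row_key: bytes) -> str:
    """Return the server whose region range [start, end) contains row_key."""
    # Binary search for the last region whose start key is <= row_key.
    idx = bisect_right(START_KEYS, row_key) - 1
    if idx < 0:
        raise KeyError(f"no region covers {row_key!r}")
    _start, end, server = REGIONS[idx]
    # Guard against a key past the end of a bounded region.
    if end and row_key >= end:
        raise KeyError(f"no region covers {row_key!r}")
    return server


print(locate_server(b"3"))
```

With this table, key b"3" falls in the region starting at b"1", so the Put would go to
server B. Note that b"10" also lands in that region, because b"10" sorts before b"4" as bytes.
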
-----Original Message-----
From: Christopher Tarnas [mailto:cft@tarnas.org] On Behalf Of Chris Tarnas
Sent: Wednesday, June 15, 2011 1:59 PM
To: user@hbase.apache.org
Subject: Re: Incoming Row Distribution Strategy/Algorithm Among Region Servers?

There are a few ways:

1) Dynamically as data is added. You start with one region and all data goes there. When a region
grows too big, it gets split in half. So if a region had keys 1-10, we now have 1-5 and 5-10.

2) Manually at table creation. You can specify your regions ahead of time if you have a good
handle on the data distribution. 
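Both approaches can be sketched in a few lines. This only illustrates the bookkeeping on key
ranges, not HBase's implementation: split_region and presplit are hypothetical helpers, the
split key is passed in rather than chosen the way HBase chooses it, and integer keys are used
only to match the 1-10 example above.

```python
def split_region(region, split_key):
    """Split a [start, end) region at split_key into two daughter regions."""
    start, end = region
    if not (start < split_key < end):
        raise ValueError("split key must fall strictly inside the region")
    return (start, split_key), (split_key, end)


def presplit(split_keys):
    """Build contiguous [start, end) regions from split keys chosen up front.

    None marks an open-ended boundary (first start key, last end key).
    """
    bounds = [None] + sorted(split_keys) + [None]
    return list(zip(bounds[:-1], bounds[1:]))


# 1) Dynamic: the region holding keys 1-10 is split at 5.
print(split_region((1, 10), 5))

# 2) Manual: two split keys chosen at table creation give three regions.
print(presplit([5, 10]))
```

In both cases the result is a set of contiguous ranges, so every row key falls in exactly
one region.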


On Jun 15, 2011, at 10:47 AM, Shuja Rehman wrote:

> Yeah, I understand this, but my question was: who will define the
> start and stop key of a region server? Do you get my point?
> On Wed, Jun 15, 2011 at 9:53 PM, Doug Meil <doug.meil@explorysmedical.com> wrote:
>> This is briefly covered in the client architecture overview...
>> http://hbase.apache.org/book.html#client
>> ... the gist is that as David describes the client talks directly to 
>> the RegionServers, and knows the start/end keys available.
>> -----Original Message-----
>> From: Buttler, David [mailto:buttler1@llnl.gov]
>> Sent: Wednesday, June 15, 2011 12:28 PM
>> To: user@hbase.apache.org
>> Subject: RE: Incoming Row Distribution Strategy/Algorithm Among 
>> Region Servers?
>> Seems pretty simple to me, but I am probably glossing over details:
>> You insert a row with key '3'
>> HBase has regions (format start key, end key): (0,1), (1,4), (4,10)
>> Assume three region servers A, B, C holding the corresponding regions
>> Your client gets the location of the region server holding the meta
>> data (from zookeeper) and asks for the region server that is
>> responsible for key '3'.  It caches this information so that it
>> doesn't have to ask again for a while.  It then sends the insert
>> statement to that region server.
>> Asking for the region server that contains key '3' is probably a 
>> simple binary search, but I haven't looked it up. The client could 
>> likely easily hold the entire list of regions to region server 
>> mappings in memory and do the binary search locally.
>> Dave
>> -----Original Message-----
>> From: Shuja Rehman [mailto:shujamughal@gmail.com]
>> Sent: Wednesday, June 15, 2011 4:25 AM
>> To: user@hbase.apache.org
>> Subject: Incoming Row Distribution Strategy/Algorithm Among Region Servers?
>> Hi,
>> I am wondering if anybody can let me know how HBase redirects an
>> input row to a particular region server?  What is the exact algorithm
>> used to distribute the incoming rows to particular region
>> servers?  Can I get detailed information or a flow diagram about this? e.g.
>> Row1 -> Some Algorithm -> RegionServerX, and the details of "Some Algorithm"
>> Thanks
>> --
>> Regards
>> Shuja-ur-Rehman Baig
>> <http://pk.linkedin.com/in/shujamughal>
> --
> Regards
> Shuja-ur-Rehman Baig
> <http://pk.linkedin.com/in/shujamughal>
