hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Region Splitting for moderate amount of daily data - Improve MapReduce Performance
Date Sat, 16 Apr 2011 01:56:03 GMT
Joe:
Take a look at HBASE-3779.

I linked my blog to HBASE-3609, which reflects the most recent development.

Cheers

On Friday, April 15, 2011, Joe Pallas <pallas@cs.stanford.edu> wrote:
>
> On Apr 15, 2011, at 3:22 PM, Stack wrote:
>
>>> The HBase rebalancer, as I understand it, adjusts region assignments, but doesn't
>>> adjust split points (hence, the number of regions). Maybe that would be a useful feature
>>> for some cases.
>>>
>>
>> What would you suggest Joe?  It currently splits regions down the
>> middle.  You'd instead have a split point that split the requests
>> happening on a region over say, the last five or ten minutes?
>
> I thought the rebalancer was only moving regions among servers, not actively doing splits.
> I guess I was mistaken.
>
> In any case, there could be either a heuristic or a hint from the table description to
> handle cases where distribution should be favored (split regions to distribute evenly across
> region servers), because the keyspace is sparsely occupied but updates are uniformly distributed
> and it's desirable to distribute the update load.
>
> Bear in mind, I'm just speculating; I don't have experience yet with a reasonably sized
> workload.
>
> joe
>
>
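
For illustration only, here is a minimal sketch of the "split regions to distribute evenly" idea Joe describes above: given a key range and a desired number of regions, compute evenly spaced split keys up front instead of waiting for size-based splits down the middle. The function name even_split_points is hypothetical and is not an HBase API, and the sketch assumes fixed-width row keys compared as big-endian byte strings.

```python
from typing import List


def even_split_points(start: bytes, end: bytes, num_regions: int) -> List[bytes]:
    """Return num_regions - 1 split keys evenly spaced between start and end.

    Hypothetical helper, not part of HBase. Assumes start and end are
    fixed-width keys of equal length, ordered as big-endian byte strings.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    if len(start) != len(end):
        raise ValueError("start and end keys must have the same width")

    width = len(start)
    lo = int.from_bytes(start, "big")
    hi = int.from_bytes(end, "big")
    span = hi - lo
    if span < num_regions:
        raise ValueError("key range too small for the requested region count")

    # The i-th split key sits i/num_regions of the way through the range.
    return [
        (lo + span * i // num_regions).to_bytes(width, "big")
        for i in range(1, num_regions)
    ]
```

For example, even_split_points(b"\x00", b"\x80", 4) yields the three split keys b"\x20", b"\x40" and b"\x60", giving four equal-width regions. Equal key-range width only balances load if updates really are uniform over the keyspace, which is the case Joe has in mind.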
