hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: StochasticLoadBalancer questions
Date Fri, 13 Jan 2017 18:27:12 GMT
Logged HBASE-17462 for #2.

FYI
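
For readers following #2 (quoted below), here is a minimal sketch of how a rate could be derived from two successive lifetime counters. It is not the change tracked in HBASE-17462 and not HBase code: the class name, method name, and the counter-reset handling are all assumptions made for illustration. It keeps only the previous count and timestamp per region, in line with the concern below about not storing a long history in the master.

```java
/** Hypothetical sketch: derive a request rate from two lifetime counter samples. */
final class RequestRateSketch {

  /**
   * Returns requests per second between two samples of a monotonically
   * increasing lifetime counter. All parameter names are illustrative.
   */
  static double requestsPerSecond(long prevCount, long prevMillis,
                                  long currCount, long currMillis) {
    long elapsedMillis = currMillis - prevMillis;
    if (elapsedMillis <= 0) {
      // No usable time window, so report no rate rather than divide by zero.
      return 0.0;
    }
    // Assumption: a counter that went backwards was reset (for example the
    // region was reopened), so the current value is treated as the delta.
    long delta = currCount >= prevCount ? currCount - prevCount : currCount;
    return delta * 1000.0 / elapsedMillis;
  }

  public static void main(String[] args) {
    // 600 requests over 60 seconds.
    System.out.println(requestsPerSecond(1000L, 0L, 1600L, 60_000L));
  }
}
```

With a lifetime total, a region that was busy last week and idle today still looks expensive; with a rate like the one above, only the most recent window counts.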

On Thu, Jan 12, 2017 at 8:49 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> For #2, I think MemstoreSizeCostFunction belongs to the same category if
> we are to adopt a moving average.
>
> Some factors to consider:
>
> The data structure used by StochasticLoadBalancer should be concise. The
> number of regions in a cluster can be expected to approach 1 million. We
> cannot afford to store a long history of read / write requests in the
> master.
>
> Cost calculation must be efficient: the balancer goes through many cost
> functions, so each one is expected to return quickly. Otherwise we would
> not come up with proper region movement plan(s) in time.
>
> Cheers
>
> On Wed, Jan 11, 2017 at 5:51 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> For #2, I think it makes sense to try out using request rates for cost
>> calculation.
>>
>> If the experiment result turns out to be better, we can consider using
>> such a measure.
>>
>> Thanks
>>
>> On Wed, Jan 11, 2017 at 5:34 PM, Timothy Brown <tim@siftscience.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have a couple of questions about the StochasticLoadBalancer.
>>>
>>> 1) In CostFromRegionLoadFunction.getRegionLoadCost the cost weights
>>> later samples of the RegionLoad more heavily than earlier ones. For
>>> example, with a queue size of 4 it would be
>>> (.5 * load1 + .25 * load2 + .125 * load3 + .125 * load4).
>>> Is this the intended behavior?
>>>
>>> 2) Would it make more sense to calculate the ReadRequestCost and
>>> WriteRequestCost as rates? Right now it looks like the cost is just based
>>> off the total number of read/write requests a region has gotten over its
>>> lifetime.
>>>
>>> -Tim
>>>
>>
>>
>
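
To illustrate the weighting described in question 1, here is a minimal sketch of a running "halving" average: each new sample is averaged with the cost accumulated so far. With four samples this yields weights of .5, .25, .125 and .125, with the newest sample getting .5, which matches the example in the question. The class and method names are hypothetical, and this is an illustration of the arithmetic only, not a copy of CostFromRegionLoadFunction.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a running halving average over load samples. */
final class HalvingAverageSketch {

  /**
   * Folds samples ordered oldest to newest. The first sample seeds the cost;
   * every later sample is averaged 50/50 with the cost so far, so the newest
   * sample always carries half of the total weight.
   */
  static double cost(Iterable<Double> samplesOldestFirst) {
    double cost = 0.0;
    boolean first = true;
    for (double sample : samplesOldestFirst) {
      if (first) {
        cost = sample;
        first = false;
      } else {
        cost = 0.5 * cost + 0.5 * sample;
      }
    }
    return cost;
  }

  public static void main(String[] args) {
    // Illustrative values only, oldest first.
    Deque<Double> queue = new ArrayDeque<>();
    queue.add(8.0);
    queue.add(8.0);
    queue.add(4.0);
    queue.add(2.0);
    System.out.println(cost(queue));
  }
}
```

The appeal of this form is that it needs no stored weights and only one pass over a short queue, at the price of the two oldest samples sharing the same weight.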
