hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: StochasticLoadBalancer questions
Date Fri, 13 Jan 2017 22:10:54 GMT
For #2, you're more than welcome to attach a patch to the JIRA.

For #1, the last time I tried to trace which JIRA introduced the formula, I
ended up at one by Elliott that just moved that line of code.
I can spend more time on this in the future.

What downside have you observed for #1?

Cheers

On Fri, Jan 13, 2017 at 2:07 PM, Timothy Brown <tim@siftscience.com> wrote:

> I tried it out on our staging cluster and saw that the total number of
> requests per region server was a bit more balanced with our current weights for
> the read and write costs. I did not attempt to calculate the exact requests
> per second but rather looked at a relative rate by averaging the increase
> in reads and writes over the interval that the RegionLoad is currently
> polled. This should have the same desired effect of balancing the number of
> requests across the cluster. If you don't mind, I would like to take a stab
> at the JIRA you've created.
>
> For #1, any idea if this is the desired behavior?
>
> Thanks,
> Tim
>
> On Fri, Jan 13, 2017 at 10:27 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > Logged HBASE-17462 for #2.
> >
> > FYI
> >
> > On Thu, Jan 12, 2017 at 8:49 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > For #2, I think MemstoreSizeCostFunction belongs to the same category if
> > > we are to adopt a moving average.
> > >
> > > Some factors to consider:
> > >
> > > The data structure used by StochasticLoadBalancer should be concise. The
> > > number of regions in a cluster can be expected to approach 1 million. We
> > > cannot afford to store a long history of read/write requests in the
> > > master.
> > >
> > > Efficiency of cost calculation should be high: the balancer goes through
> > > many cost functions, and each cost function is expected to return
> > > quickly. Otherwise we would not come up with proper region movement
> > > plan(s) in time.
> > >
> > > Cheers
> > >
> > > On Wed, Jan 11, 2017 at 5:51 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > >> For #2, I think it makes sense to try out using request rates for cost
> > >> calculation.
> > >>
> > >> If the experiment result turns out to be better, we can consider using
> > >> such measure.
> > >>
> > >> Thanks
> > >>
> > >> On Wed, Jan 11, 2017 at 5:34 PM, Timothy Brown <tim@siftscience.com>
> > >> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I have a couple of questions about the StochasticLoadBalancer.
> > >>>
> > > >>> 1) In CostFromRegionLoadFunction.getRegionLoadCost, the cost weights
> > > >>> later samples of the RegionLoad more than previous ones. For example,
> > > >>> with a queue size of 4 it would be
> > > >>> (.5 * load1 + .25 * load2 + .125 * load3 + .125 * load4).
> > > >>> Is this the intended behavior?
> > >>>
> > > >>> 2) Would it make more sense to calculate the ReadRequestCost and
> > > >>> WriteRequestCost as rates? Right now it looks like the cost is just
> > > >>> based off the total number of read/write requests a region has gotten
> > > >>> over its lifetime.
> > >>>
> > >>> -Tim
> > >>>
> > >>
> > >>
> > >
> >
>
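
Sketch for #1. One way the weights quoted in the original question can arise is
by folding the samples oldest to newest and averaging the running cost with each
new sample. The Python below is only an illustration under that assumption; the
names fold_by_halving and effective_weights are hypothetical and are not taken
from the HBase source.

```python
def fold_by_halving(samples):
    """Fold samples (oldest first) by averaging the running cost with each new one."""
    cost = None
    for sample in samples:
        # Assumption: the first sample seeds the cost; every later sample
        # contributes half of the new running value.
        cost = sample if cost is None else 0.5 * cost + 0.5 * sample
    return cost


def effective_weights(n):
    """Return the weight each of n samples (oldest first) receives in the fold."""
    weights = []
    for i in range(n):
        unit = [0.0] * n
        unit[i] = 1.0  # probe with a single non-zero sample to read off its weight
        weights.append(fold_by_halving(unit))
    return weights


print(effective_weights(4))  # [0.125, 0.125, 0.25, 0.5]
```

With four samples the newest gets .5 and the two oldest share .125 each, which
matches the weights in the question if load1 is taken to be the most recent
sample. The two oldest samples end up tied because the first one only seeds the
running value.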

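Sketch for #2. The relative rate described above (averaging the increase in
reads and writes between polls) can be illustrated as follows. This is a minimal
sketch, not the patch discussed in the thread: mean_increase is a hypothetical
helper, the input is assumed to be a short list of cumulative request counts
taken at a fixed polling interval, and treating a counter reset as zero growth
is an assumption made here for simplicity.

```python
def mean_increase(cumulative_counts):
    """Average per-poll increase of a cumulative request counter (oldest first)."""
    if len(cumulative_counts) < 2:
        return 0.0  # a single sample carries no rate information
    deltas = [
        max(later - earlier, 0)  # assumption: a counter reset counts as zero growth
        for earlier, later in zip(cumulative_counts, cumulative_counts[1:])
    ]
    return sum(deltas) / len(deltas)


# A long-lived region with a large lifetime total but little recent traffic.
old_region = [1_000_000, 1_000_010, 1_000_020]
# A recently created region that is currently busy.
new_region = [0, 500, 1_000]

print(mean_increase(old_region))  # 10.0
print(mean_increase(new_region))  # 500.0
```

Ranking by lifetime totals would treat the first region as the more expensive
one, while ranking by the increase per poll favors the second. Only a few recent
samples per region are needed, which keeps the data held in the master small.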