helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Auto-rebalancing question
Date Mon, 10 Nov 2014 06:43:15 GMT
I will try this and get back to you.

On Fri, Nov 7, 2014 at 8:21 AM, Tom Widmer <Tom.Widmer@camcog.com> wrote:

>  On 6 Nov 2014, at 15:27, kishore g <g.kishore@gmail.com> wrote:
>   Thanks, Tom. Good observation. Helix moves the partition back to keep the
> locks evenly distributed at all times; if we don't move it back, the node
> that came back up will sit idle. This assumes the number of replicas is
> greater than the number of nodes.
>  I think I get this - if, say, all instances have a capacity of 2, then
> you might end up with some instances containing 2 and some 0, using the
> current rebalancing algorithm, which isn’t what you want (idle node). I
> guess the algorithm would need tweaking to make sure that every node had
> either capacity or capacity-1 partitions, so that those 0’s wouldn’t be
> acceptable in that case and would have partitions moved from nodes with
> full capacity. I could possibly look at making this change for you? I’d
> need info on how to submit patches.
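The "capacity or capacity-1" idea above can be illustrated with a small sketch. This is not Helix's rebalancing algorithm; `MinimalMoveRebalance` and its `rebalance` method are hypothetical names, and the node-to-partitions map is an assumed stand-in for the real assignment. It moves a partition only while the most and least loaded nodes differ by more than one:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MinimalMoveRebalance {
    // Hypothetical helper, not part of Helix: moves one partition at a time
    // from the most loaded node to the least loaded node until the loads
    // differ by at most one. Returns the number of partitions moved.
    public static int rebalance(Map<String, List<String>> assignment) {
        int moves = 0;
        while (true) {
            String min = null;
            String max = null;
            for (String node : assignment.keySet()) {
                int size = assignment.get(node).size();
                if (min == null || size < assignment.get(min).size()) min = node;
                if (max == null || size > assignment.get(max).size()) max = node;
            }
            if (min == null
                    || assignment.get(max).size() - assignment.get(min).size() <= 1) {
                return moves; // already balanced: nothing (more) to move
            }
            List<String> from = assignment.get(max);
            assignment.get(min).add(from.remove(from.size() - 1));
            moves++;
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> assignment = new TreeMap<>();
        assignment.put("node1", new ArrayList<>(List.of("p0", "p1")));
        assignment.put("node2", new ArrayList<>(List.of("p2", "p3")));
        assignment.put("node3", new ArrayList<>()); // node that just rejoined
        int moves = rebalance(assignment);
        System.out.println(moves + " move(s): " + assignment);
    }
}
```

With four partitions on three nodes, the idle node receives exactly one partition. With a single partition on two nodes, the loads already differ by only one, so nothing moves back.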
>   For a single partition, or in general when numPartitions * numReplicas <
> number of nodes, I agree that moving back is unnecessary. We can think
> about making the algorithm smarter.
>  Same with the second case: I expected minimum movement. Your suggestion
> makes sense. Kanak, what do you think?
>  For the single-partition use case, I think you can probably use the
> LeaderStandby model and set the number of replicas to the number of nodes.
> In this case, I believe the leader will not move back when the old node
> comes back up. Kanak/Jason, I believe we made this change some time back.
> Correct me if I am wrong.
>  I had a look at this option, but the problem is that I’d need to
> hard-code the number of instances, which I’d rather avoid. I guess it might
> work if I allocated a number larger than the expected number of nodes I’d
> ever have?
>  I tried setting up a state machine with ’N’ standby nodes, but
> ZKHelixAdmin.rebalance has some checks saying you can only have:
>    - no more than 1 state with an upper bound of 1
>    - no more than 1 state with an upper bound of R
>    - no more than 1 state with an upper bound of N, in which case you
>    can’t have any other states with either R or 1 as their upper bound (which
>    messes up my case, where I’d want 1 leader and (N-1) standbys, ideally)
>  Are those checks definitely all necessary for full-auto mode?
>  Any alternatives other than writing a user-defined rebalancer?
>  Thanks,
>  Tom
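To make the three upper-bound rules listed above concrete, here is a minimal sketch that paraphrases them exactly as described in the message. It is not taken from the Helix source: `UpperBoundCheck` and `isAccepted` are hypothetical names, and the map of state name to upper bound ("1", "R", or "N") is an assumed representation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UpperBoundCheck {
    // Hypothetical paraphrase of the checks described in the message above;
    // not Helix source code. Keys are state names, values are upper bounds.
    public static boolean isAccepted(Map<String, String> upperBounds) {
        int ones = 0;
        int rs = 0;
        int ns = 0;
        for (String bound : upperBounds.values()) {
            switch (bound) {
                case "1": ones++; break;
                case "R": rs++; break;
                case "N": ns++; break;
                default: break; // other bounds are outside the rules described
            }
        }
        // At most one state per kind of bound.
        if (ones > 1 || rs > 1 || ns > 1) return false;
        // A state bounded by N excludes any state bounded by 1 or R.
        return ns == 0 || (ones == 0 && rs == 0);
    }

    public static void main(String[] args) {
        // Assumed bounds for a leader/standby style model: one leader, R standbys.
        Map<String, String> leaderStandby = new LinkedHashMap<>();
        leaderStandby.put("LEADER", "1");
        leaderStandby.put("STANDBY", "R");
        System.out.println(isAccepted(leaderStandby)); // prints true

        // The case from the message: one leader plus standbys bounded by N.
        Map<String, String> leaderAndNStandbys = new LinkedHashMap<>();
        leaderAndNStandbys.put("LEADER", "1");
        leaderAndNStandbys.put("STANDBY", "N");
        System.out.println(isAccepted(leaderAndNStandbys)); // prints false
    }
}
```

Under these rules as stated, one leader with N-bounded standbys is rejected, which is the conflict described in the message.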
