helix-user mailing list archives

From Tom Widmer <Tom.Wid...@camcog.com>
Subject Re: Auto-rebalancing question
Date Fri, 07 Nov 2014 16:21:36 GMT
On 6 Nov 2014, at 15:27, kishore g <g.kishore@gmail.com> wrote:

> Thanks Tom. Good observation. The reason Helix moves the partition back is to maintain an
> equal distribution of locks at all times. If we don't move it back, the node that came back
> up will be idle. This assumes the number of replicas is more than the number of nodes.

I think I get this - if, say, all instances have a capacity of 2, then you might end up with
some instances containing 2 and some 0, using the current rebalancing algorithm, which isn’t
what you want (idle node). I guess the algorithm would need tweaking to make sure that every
node had either capacity or capacity-1 partitions, so that those 0’s wouldn’t be acceptable
in that case and would have partitions moved from nodes with full capacity. I could possibly
look at making this change for you? I’d need info on how to submit patches.
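
To make the idea concrete, here is a rough sketch of the kind of tweak I mean. It is not Helix's rebalancer: the class and method names are made up for illustration, and I have generalised "capacity or capacity-1" to "no two nodes differ by more than one partition". It only moves a partition when the spread is larger than 1, so a node that comes back up is filled from the fullest node, while a lone partition is never moved back.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of Helix. */
final class MinimalMoveSketch {

    /**
     * Takes partition counts per node and returns new counts in which
     * no two nodes differ by more than one partition. Partitions are
     * moved one at a time from the fullest node to the emptiest node,
     * and nothing moves if the counts are already within 1 of each other.
     */
    static Map<String, Integer> rebalance(Map<String, Integer> counts) {
        Map<String, Integer> result = new HashMap<>(counts);
        if (result.isEmpty()) {
            return result;
        }
        while (true) {
            String fullest = null;
            String emptiest = null;
            for (String node : result.keySet()) {
                if (fullest == null || result.get(node) > result.get(fullest)) {
                    fullest = node;
                }
                if (emptiest == null || result.get(node) < result.get(emptiest)) {
                    emptiest = node;
                }
            }
            if (result.get(fullest) - result.get(emptiest) <= 1) {
                return result; // already balanced: no further movement
            }
            result.merge(fullest, -1, Integer::sum);
            result.merge(emptiest, 1, Integer::sum);
        }
    }
}
```

With counts of 2, 2 and 0 this moves exactly one partition onto the idle node. With counts of 1 and 0 (the single-partition case) it moves nothing.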

> For a single partition, or in general when numPartitions * numReplicas < the number of nodes,
> I agree that moving back is unnecessary. We can think about making the algorithm smarter.
>
> Same with the second case: I expected minimum movement. Your suggestion makes sense. Kanak,
> what do you think?
>
> For the single partition use case, I think you can probably use the LeaderStandby model and
> set the number of replicas to the number of nodes. In this case, I believe the leader will
> not move back when the old node comes back up. Kanak/Jason, I believe we made this change
> some time back. Correct me if I am wrong.

I had a look at this option, but the problem is that I’d need to hard-code the number of
instances, which I’d rather avoid. I guess it might work if I allocated a number larger
than the expected number of nodes I’d ever have?

I tried setting up a state machine with ’N’ standby nodes, but ZKHelixAdmin.rebalance
has some checks saying you can only have:

  *   no more than 1 state with an upper bound of 1
  *   no more than 1 state with an upper bound of R
  *   no more than 1 state with an upper bound of N, in which case you can’t have any other
      states with either R or 1 as their upper bound (which messes up my case, where I’d want
      1 leader and (N-1) standbys, ideally)
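
As I read them, the three checks amount to something like the sketch below. This is my own stand-alone model of the rules, not the code in ZKHelixAdmin.rebalance; the class and method names are hypothetical, and the bounds are passed as plain strings ("1", "R" for the replica count, "N" for the instance count).

```java
import java.util.Map;

/** Hypothetical model of the three checks listed above; not Helix's actual code. */
final class UpperBoundChecks {

    /** upperBounds maps a state name to its upper bound: "1", "R" or "N". */
    static boolean isAccepted(Map<String, String> upperBounds) {
        int ones = 0;
        int rs = 0;
        int ns = 0;
        for (String bound : upperBounds.values()) {
            switch (bound) {
                case "1": ones++; break;
                case "R": rs++; break;
                case "N": ns++; break;
                default: break; // other bounds are outside the rules listed above
            }
        }
        // At most one state per kind of bound.
        if (ones > 1 || rs > 1 || ns > 1) {
            return false;
        }
        // A state bounded by N excludes any state bounded by R or 1.
        return ns == 0 || (ones == 0 && rs == 0);
    }

    public static void main(String[] args) {
        // The usual leader/standby shape: one state bounded by 1, one by R.
        System.out.println(isAccepted(Map.of("LEADER", "1", "STANDBY", "R"))); // true
        // The shape I wanted: one leader, standbys bounded by N.
        System.out.println(isAccepted(Map.of("LEADER", "1", "STANDBY", "N"))); // false
    }
}
```

Under this reading, a model with 1 leader and N standbys is rejected by the last rule, even though each bound on its own is allowed.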

Are those checks definitely all necessary for full-auto mode?

Any alternatives other than writing a user-defined rebalancer?


