helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Auto-rebalancing question
Date Thu, 06 Nov 2014 15:27:56 GMT
Thanks Tom. Good observation. The reason Helix moves the partition back is
to maintain an equal distribution of locks at all times; if we don't move
it back, the node that came back up will be idle. This assumes the number
of replicas is more than the number of nodes.

For a single partition, or in general when numPartitions * numReplicas <
nodes, I agree that moving back is unnecessary. We can think about making
the algorithm smarter.

Same with the second case; I expected minimum movement. Your suggestion
makes sense. Kanak, what do you think?
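
To make the capacity arithmetic concrete, here is a minimal sketch. It is
not the actual AutoRebalanceStrategy code; the class and method names are
made up for illustration, and it only models per-node capacity as described
in this thread: the remainder handed to the first nodes, versus rounding
every node up.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Helix.
final class CapacitySketch {

    // Behaviour described in the thread: every node gets floor(total / nodes),
    // and the remainder is handed out one each to the first nodes in order.
    static int[] remainderFirst(int numPartitions, int numReplicas, int numNodes) {
        int total = numPartitions * numReplicas;
        int base = total / numNodes;
        int remainder = total % numNodes;
        int[] capacity = new int[numNodes];
        for (int i = 0; i < numNodes; i++) {
            capacity[i] = base + (i < remainder ? 1 : 0);
        }
        return capacity;
    }

    // Tom's suggestion: every node gets ceil(total / nodes).
    static int[] roundedUp(int numPartitions, int numReplicas, int numNodes) {
        int total = numPartitions * numReplicas;
        int[] capacity = new int[numNodes];
        Arrays.fill(capacity, (total + numNodes - 1) / numNodes);
        return capacity;
    }

    // Replicas currently held above a node's capacity, i.e. the ones that
    // are forced to move no matter what else the rebalancer decides.
    static int forcedMoves(int[] current, int[] capacity) {
        int moves = 0;
        for (int i = 0; i < current.length; i++) {
            moves += Math.max(0, current[i] - capacity[i]);
        }
        return moves;
    }
}
```

For 4 partitions, 1 replica and 3 nodes, remainderFirst gives 2, 1, 1 and
roundedUp gives 2, 2, 2. With a current placement of 0, 2, 2, the first
forces two partitions off nodes 2 and 3, while the second forces none and
leaves the rebalancer free to move a single partition to the idle node.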

For the single partition use case, I think you can probably use the
LeaderStandby model and set the number of replicas to the number of nodes.
In this case, I believe the leader will not move back when the old node
comes back up. Kanak/Jason, I believe we made this change some time back.
Correct me if I am wrong.

Kishore G

But I see your point that

On Thu, Nov 6, 2014 at 5:56 AM, Tom Widmer <Tom.Widmer@camcog.com> wrote:

> Hi all,
> Firstly, thanks for open-sourcing this useful and powerful framework!
> Secondly, I have a question about full auto-rebalancing. I have some
> resources that I use with the OnlineOffline state but which only have a
> single partition at the moment (essentially, Helix is just giving me a
> simple leader election to decide who controls the resource - I don’t care
> which participant has it, as long as only one does). However, with full
> auto rebalance, I find that the ‘first’ instance (alphabetically I think)
> always gets the resource when it’s up. So if I take down the first node so
> the partition transfers to the 2nd node, then bring back up the 1st node,
> the resource transfers back unnecessarily.
> Note that this issue also affects multi-partition resources; it’s just a
> bit less noticeable. With 3 nodes and 4 partitions, say, the partitions
> are always allocated 2, 1, 1, so if you have node 1 down and hence
> 0, 2, 2, and then bring up node 1, it unnecessarily moves 2 partitions to
> make 2, 1, 1 rather than the minimum move to achieve ‘balance’, which
> would be to move 1 partition from instance 2 or 3 back to instance 1.
> I can see the code in question in
> AutoRebalanceStrategy.typedComputePartitionAssignment, where the
> distRemainder is allocated to the first nodes, so that the capacity of all
> nodes is not equal. Is there any reason why the capacity shouldn’t be
> rounded up, so that in the 1 partition case, all live instances have a
> capacity of 1, and in the 3 instances 4 partitions case above, all
> instances have a capacity of 2? This then allows the rebalance algorithm to
> move less around when instances come up or go down, without losing
> ‘balance’.
> For my immediate problem, is there a better way to hold leadership
> elections for non-partitioned resources than to have a resource with a
> partition count of 1? Or should I perhaps use a single partitioned resource
> for all non-partitioned resources and assign them each a partition number?
> Thanks,
> Tom
