helix-user mailing list archives

From Kanak Biscuitwala <kana...@hotmail.com>
Subject RE: Favoring some transitions when rebalancing in full_auto mode
Date Tue, 22 Oct 2013 06:42:18 GMT
I need to verify this, but I suspect a few things are going on, having just taken a quick look
at the code:

1) I wrote some code a while back to rearrange node preference order if what was calculated
did not sufficiently balance the number of replicas in state s across nodes. I suspect this code
is causing the problem.

2) The algorithm's initial assignment ignores preferred placement altogether, and just places
everything uniformly by a hash. This is because the algorithm treats all replicas as orphans
on the first run. Subsequent rebalances improve the situation as the algorithm never removes
preferred replicas from their nodes. I think this should probably be changed so that the preferred
replicas are placed first, especially if len(liveNodes) == len(allNodes).

3) If nodes are configured and launched at the same time, the preferred placement is not necessarily
static, though the hashing scheme is probably flexible enough to allow for this.
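
The change suggested in point 2 can be sketched roughly as follows. This is a minimal illustration, not Helix's actual code: `stable_hash`, `preferred_node` and `place` are hypothetical names, and the rule "preferred node = hash of the partition plus replica index, modulo the number of configured nodes" is an assumption made only for the example.

```python
import hashlib

def stable_hash(key):
    # Deterministic across runs, unlike Python's built-in hash() for strings.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def preferred_node(partition, replica, all_nodes):
    # Assumed rule: where this replica would live if every configured node were up.
    nodes = sorted(all_nodes)
    return nodes[(stable_hash(partition) + replica) % len(nodes)]

def place(partitions, replicas, all_nodes, live_nodes):
    live = set(live_nodes)
    assignment = {p: [] for p in partitions}
    load = {n: 0 for n in live_nodes}
    orphans = []
    # Pass 1: put each replica on its preferred node when that node is live.
    for p in partitions:
        for r in range(replicas):
            node = preferred_node(p, r, all_nodes)
            if node in live:
                assignment[p].append(node)
                load[node] += 1
            else:
                orphans.append(p)
    # Pass 2: spread the orphans over the least-loaded live nodes
    # that do not already host the partition.
    for p in orphans:
        candidates = [n for n in sorted(live) if n not in assignment[p]]
        if not candidates:
            continue  # not enough live nodes for this replica
        node = min(candidates, key=lambda n: load[n])
        assignment[p].append(node)
        load[node] += 1
    return assignment
```

When len(liveNodes) == len(allNodes), pass 2 has nothing to do and every replica lands on its preferred node on the very first run.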

I'll investigate in the morning.

Date: Mon, 21 Oct 2013 23:21:21 -0700
Subject: Re: Favoring some transitions when rebalancing in full_auto mode
From: g.kishore@gmail.com
To: user@helix.incubator.apache.org

Kanak, I thought this should be the default behavior. When the list of participants is generated
for each partition, it comprises:

1) preferred participants, i.e. where this partition would reside if all nodes were up
2) non-preferred participants, i.e. when one of the preferred participants is down we select a
non-preferred participant

If the list we generate ensures that preferred participants are put ahead
of non-preferred, the behavior Matthieu is expecting should happen by default without additional

Am I missing something?
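
A small sketch of the ordering described in this message, under stated assumptions: `preference_list` and `assign_states` are hypothetical helpers, not Helix APIs, and the first live entry in the list is assumed to take the LEADER state.

```python
def preference_list(preferred, all_nodes):
    # Preferred participants first, then the remaining nodes in a fixed order.
    rest = [n for n in sorted(all_nodes) if n not in preferred]
    return list(preferred) + rest

def assign_states(pref_list, live_nodes, replicas):
    live = set(live_nodes)
    # Keep the list order, skip dead nodes, and take as many as there are replicas.
    chosen = [n for n in pref_list if n in live][:replicas]
    return {n: ("LEADER" if i == 0 else "REPLICA") for i, n in enumerate(chosen)}

nodes = ["instance_0", "instance_1", "instance_2"]
prefs = preference_list(["instance_0", "instance_2"], nodes)

before = assign_states(prefs, nodes, 2)
after = assign_states(prefs, ["instance_1", "instance_2"], 2)
```

With these inputs, `before` has instance_0 as LEADER and instance_2 as REPLICA. Once instance_0 is gone, `after` promotes instance_2 to LEADER and brings in instance_1 as the new REPLICA, which is the failover Matthieu describes.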

On Fri, Oct 18, 2013 at 11:03 AM, Matthieu Morel <mmorel@apache.org> wrote:

Thanks Kanak for the explanation.
It will definitely be very useful to have a few more knobs for tuning the rebalancing algorithm.
I'll post a ticket soon.

On Oct 18, 2013, at 19:16 , Kanak Biscuitwala <kbiscuitwala@linkedin.com> wrote:

Currently, the FULL_AUTO algorithm does not take this into account. The algorithm optimizes
for minimal movement and even distribution of states. What I see here is that there is a tie
in terms of even distribution, and current presence of the replica would be a good tiebreaker.
I can see why this would be useful, though. Please create an issue and we'll pick it up when
we're able.
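
The tiebreaker idea could look something like the sketch below. It is only an illustration: `pick_leader` and its arguments are hypothetical, and ranking by "number of leaders already hosted" stands in for whatever evenness measure the rebalancer really uses.

```python
def pick_leader(candidates, leader_count, current_state):
    def rank(node):
        holds_replica = current_state.get(node) == "REPLICA"
        # Lower tuples win: fewest leaders first, then nodes that already
        # hold a replica of this partition, then node name for determinism.
        return (leader_count.get(node, 0), 0 if holds_replica else 1, node)
    return min(candidates, key=rank)

# Both survivors host one leader each, so distribution is tied;
# instance_2 already holds the replica and wins the tie.
choice = pick_leader(
    ["instance_1", "instance_2"],
    {"instance_1": 1, "instance_2": 1},
    {"instance_2": "REPLICA"},
)
```

Even distribution still takes priority in this sketch: if instance_2 already hosted more leaders than instance_1, instance_1 would be chosen despite not holding a replica.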

On a somewhat related note, I noticed in your example code that you configure and launch your
nodes at the same time. The FULL_AUTO rebalancer performs better when you configure your nodes
ahead of time (even if you specify more than you actually ever start). This is, of course, optional.

Thanks for the advice. Currently we expect Helix to recompute states and partitions as nodes
join the cluster, though indeed it's probably more efficient to compute some of the schedule
ahead of time. I'll see how to apply your suggestion.

Best regards,


From: Matthieu Morel <mmorel@apache.org>
Reply-To: "user@helix.incubator.apache.org" <user@helix.incubator.apache.org>
Date: Friday, October 18, 2013 10:03 AM
To: "user@helix.incubator.apache.org" <user@helix.incubator.apache.org>
Subject: Favoring some transitions when rebalancing in full_auto mode


In FULL_AUTO mode, Helix computes both partitioning and states.

In a leader-replica model, I observed that when rebalancing due to a failure of the Leader
node, Helix does not promote an existing replica to leader, but instead assigns a new leader
(i.e. going from offline to replica to leader).

For quick failover, we need to have the replica promoted to leader instead. Is there a way
to do so in FULL_AUTO mode?

Apparently with SEMI_AUTO that would be possible, but it would imply we control the partitioning,
and we'd prefer Helix to control that as well.

I tried to play with the priorities in the definition of the state model, with no luck so far.

(See below for an example of how rebalancing currently takes place.)



Here we have a deployment with 3 nodes, 3 partitions and 2 desired states, Leader and Replica
(and offline).

// initial states

      ,"instance_2":"REPLICA"  // Instance2 is replica

// instance 0 dies

      "instance_1":"LEADER" // Helix preferred to assign leadership of resource 2 to instance 1
                            // rather than promoting instance_2 from replica to leader
      ,"instance_2":"REPLICA" // instance 2 is still replica for resource 2
