helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Message throttling of controller behavior unexpectedly when there are multiple constraints
Date Sun, 17 May 2015 03:23:32 GMT
Thanks Hang for the detailed explanation.

Before the MessageSelectionStage, there is a stage that orders the messages
according to the state transition priority list. I think Slave-Master always
has higher priority than Offline-Slave, which makes sense because, in
general, having a master is probably more important than having two slaves.

Can you provide the state transition priority list from your state model
definition? If you think that it's important to get node B to the Slave state
before promoting node A from Slave to Master, you can change the priority
order. Note: this can be changed dynamically and does not require
restarting the servers.
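
As a rough illustration of what ordering by a priority list means, here is a
minimal sketch. It is not Helix code: the class name, the method, and the
"FROM-TO" string format are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a toy ordering step, not the Helix implementation.
final class TransitionPrioritySketch {
    /**
     * Returns a copy of the pending transitions ordered by their position in
     * priorityList (earlier means higher priority). Transitions that are not
     * in the list sort last. The sort is stable, so ties keep their order.
     */
    static List<String> order(List<String> pending, List<String> priorityList) {
        List<String> sorted = new ArrayList<>(pending);
        sorted.sort(Comparator.comparingInt((String t) -> {
            int index = priorityList.indexOf(t);
            return index < 0 ? Integer.MAX_VALUE : index;
        }));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> pending = Arrays.asList("OFFLINE-SLAVE", "SLAVE-MASTER");
        // Promotion first, as described above.
        System.out.println(order(pending, Arrays.asList("SLAVE-MASTER", "OFFLINE-SLAVE")));
        // Swapping the list puts Offline-Slave ahead of the promotion.
        System.out.println(order(pending, Arrays.asList("OFFLINE-SLAVE", "SLAVE-MASTER")));
    }
}
```

In this sketch, swapping the two entries of the priority list is all it takes
to bring up slaves before promoting a master.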

Another question: what is the reason for having constraint #2, i.e. only one
transition per partition at a time?

thanks,
Kishore G



On Sat, May 16, 2015 at 4:48 PM, Hang Qi <hangq.1985@gmail.com> wrote:

> Hi folks,
>
> We found very strange behavior in the controller's message throttling when
> there are multiple constraints. Here is our setup (we are using
> helix-0.6.4, with only one resource):
>
>    - constraint 1: a per-node constraint; we allow only 3 state transitions
>    to happen concurrently on one node.
>    - constraint 2: a per-partition constraint; we define the state
>    transition priorities in the state model, and allow only one state
>    transition to happen on a single partition at a time.
>
> We are using the MasterSlave state model. Suppose we have two nodes, A and
> B, each hosting 8 partitions (p0-p7), and initially both A and B are shut
> down. Now we start them at the same time (say A slightly earlier than B).
>
> The expected behavior might be:
>
>    1. p0, p1, p2 on A start Offline -> Slave; p3, p4, p5 on B start
>    Offline -> Slave
>
> But the real result is:
>
>    1. p0, p1, p2 on A start Offline -> Slave; nothing happens on B
>    2. only after p0, p1, p2 have all transitioned to the Master state do
>    p3, p4, p5 on A start Offline -> Slave and p0, p1, p2 on B start
>    Offline -> Slave
>
> As the Offline -> Slave step might take a long time, this behavior results
> in a very long time to bring up these two nodes (a long downtime results in
> a long catch-up time as well), though ideally we should not let both nodes
> go down at the same time.
>
> I looked at the controller code; the stage- and pipeline-based
> implementation is well designed, and very easy to understand and reason
> about.
>
> The logic of MessageThrottleStage#throttle is:
>
>    1. it goes through each message selected by MessageSelectionStage,
>    2. for each message, it goes through all the matching constraints
>    and decreases the quota of each constraint,
>       1. if any constraint's quota is less than 0, this message is
>       marked as throttled.
>
> I think there is something wrong here: a message takes the quota of its
> constraints even if it is not going to be sent out (throttled). That
> explains our case:
>
>    - all the messages have been generated at the beginning: (p0, A,
>    Offline->Slave), ... (p7, A, Offline->Slave), (p0, B, Offline->Slave), ...,
>    (p7, B, Offline->Slave)
>    - in MessageThrottleStage#throttle
>       - (p0, A, Offline->Slave), (p1, A, Offline->Slave), (p2, A,
>       Offline->Slave) are good, and constraint 1 on A reaches 0; constraint 2
>       on p0, p1, p2 reaches 0 as well
>       - (p3, A, Offline->Slave), ... (p7, A, Offline->Slave) are throttled by
>       constraint 1 on A, but also take the quota of constraint 2 on those
>       partitions.
>       - (p0, B, Offline->Slave), ... (p7, B, Offline->Slave) are throttled by
>       constraint 2
>       - thus only (p0, A, Offline->Slave), (p1, A, Offline->Slave), (p2,
>       A, Offline->Slave) have been sent out by the controller.
>
> Does that make sense, or is there anything else you can think of that
> could result in this unexpected behavior? And is there any workaround for
> it? One thing that comes to mind is to change constraint 2 so that it
> allows only one state transition per partition for certain state
> transitions only.
>
> Thanks very much.
>
> Thanks
> Hang Qi
>
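
The throttling walk-through in the quoted message can be reproduced with a
small self-contained sketch. It is a toy model under stated assumptions, not
the Helix implementation: the class, the method names, and the
"partition@node" labels are hypothetical, and only the quotas (3 per node,
1 per partition) and the message order come from the description above. The
chargeThrottled flag switches between charging quota for every message and
charging it only for messages that are actually sent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a toy model of the throttling loop, not Helix code.
final class ThrottleSketch {
    static final int PER_NODE_QUOTA = 3;      // constraint 1
    static final int PER_PARTITION_QUOTA = 1; // constraint 2

    /**
     * Returns the messages that would be sent, as "partition@node" strings.
     * If chargeThrottled is true, a throttled message still consumes quota
     * (the behavior described in the message); if false, only sent ones do.
     */
    static List<String> throttle(List<String[]> messages, boolean chargeThrottled) {
        Map<String, Integer> nodeQuota = new HashMap<>();
        Map<String, Integer> partitionQuota = new HashMap<>();
        List<String> sent = new ArrayList<>();
        for (String[] message : messages) {
            String partition = message[0];
            String node = message[1];
            int nodeLeft = nodeQuota.getOrDefault(node, PER_NODE_QUOTA);
            int partitionLeft = partitionQuota.getOrDefault(partition, PER_PARTITION_QUOTA);
            // A message passes only if every matching constraint has quota left.
            boolean allowed = nodeLeft > 0 && partitionLeft > 0;
            if (allowed || chargeThrottled) {
                nodeQuota.put(node, nodeLeft - 1);
                partitionQuota.put(partition, partitionLeft - 1);
            }
            if (allowed) {
                sent.add(partition + "@" + node);
            }
        }
        return sent;
    }

    /** Builds the scenario: p0..p7 Offline->Slave on A, then p0..p7 on B. */
    static List<String[]> scenario() {
        List<String[]> messages = new ArrayList<>();
        for (String node : new String[] {"A", "B"}) {
            for (int p = 0; p < 8; p++) {
                messages.add(new String[] {"p" + p, node});
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        System.out.println("charging throttled messages: " + throttle(scenario(), true));
        System.out.println("charging only sent messages: " + throttle(scenario(), false));
    }
}
```

With chargeThrottled set to true, the sketch sends only p0, p1, p2 on A,
matching the observed result. With it set to false, it also sends p3, p4, p5
on B, matching the expected behavior.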
