helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Prevent failback to MASTER after failover
Date Thu, 09 May 2013 17:04:17 GMT
Yes, this is another place where we need the API to be more descriptive. We
should dedicate one release to cleaning up our APIs and adding javadocs.

With regard to your problem: even though adding a throttling constraint works,
it's really a workaround and not very elegant. Can you please file a JIRA
for this and explain the problem? We need to think of a better solution.
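
To make the round-by-round behaviour in Jason's explanation (quoted below)
concrete, here is a toy model in plain Java. It is only a sketch and does not
use any Helix classes: ThrottleSketch, Transition, rounds() and example() are
made-up names, the priority numbers are assumed, and the explicit "after" list
on T3 is my simplification for the fact that T3 is only sent in a later round
than T1 and T2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are Helix classes.
final class ThrottleSketch {

    // A pending state transition message.
    static final class Transition {
        final String id;
        final String instance;
        final int priority;        // lower value = considered first (assumed numbers)
        final List<String> after;  // ids that must have completed in an earlier round

        Transition(String id, String instance, int priority, List<String> after) {
            this.id = id;
            this.instance = instance;
            this.priority = priority;
            this.after = after;
        }
    }

    // Groups transitions into rounds. At most `limit` messages are sent per
    // round for each instance (perInstance = true) or for the whole cluster
    // (perInstance = false).
    static List<List<String>> rounds(List<Transition> pending, boolean perInstance, int limit) {
        List<Transition> todo = new ArrayList<>(pending);
        todo.sort(Comparator.comparingInt((Transition t) -> t.priority));
        Set<String> done = new HashSet<>();
        List<List<String>> result = new ArrayList<>();
        while (!todo.isEmpty()) {
            Map<String, Integer> used = new HashMap<>();
            List<Transition> sent = new ArrayList<>();
            for (Transition t : todo) {
                // Not eligible until its prerequisites finished in an earlier round.
                if (!done.containsAll(t.after)) continue;
                String key = perInstance ? t.instance : "CLUSTER";
                if (used.getOrDefault(key, 0) >= limit) continue;  // throttled
                used.merge(key, 1, Integer::sum);
                sent.add(t);
            }
            if (sent.isEmpty()) throw new IllegalStateException("nothing can be sent");
            List<String> ids = new ArrayList<>();
            for (Transition t : sent) {
                ids.add(t.id);
                done.add(t.id);
            }
            todo.removeAll(sent);
            result.add(ids);
        }
        return result;
    }

    // The three transitions from the thread, with assumed priorities.
    static List<Transition> example() {
        return Arrays.asList(
            new Transition("T1", "node1", 1, Collections.<String>emptyList()),  // Offline->Slave, Node-1
            new Transition("T2", "node2", 2, Collections.<String>emptyList()),  // Master->Slave, Node-2
            new Transition("T3", "node1", 0, Arrays.asList("T1", "T2")));       // Slave->Master, Node-1
    }

    public static void main(String[] args) {
        System.out.println("per instance: " + rounds(example(), true, 1));
        System.out.println("per cluster:  " + rounds(example(), false, 1));
    }
}
```

Running main should print [[T1, T2], [T3]] for the per-instance limit and
[[T1], [T2], [T3]] for the cluster-wide limit, which matches the two cases
described in the quoted message.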

Thanks,
Kishore G


On Thu, May 9, 2013 at 9:33 AM, Ming Fang <mingfang@mac.com> wrote:

> Thanks for adding the test case.
> So it looks like I just have to remove the INSTANCE constraint.
>
>
> Sent from my iPad
>
> On May 8, 2013, at 7:18 PM, Zhen Zhang <nehzgnahz@gmail.com> wrote:
>
> Hi Ming, I've added a test case for this, see TestMessageThrottle2.java.
> It is just a copy of your example with minor changes.
>
>
> https://github.com/apache/incubator-helix/blob/master/helix-core/src/test/java/org/apache/helix/integration/TestMessageThrottle2.java
>
>
> At step 3), when you are adding Node-1, three state transition messages
> need to be sent:
> T1) Offline->Slave for Node-1
> T2) Master->Slave for Node-2
> T3) Slave->Master for Node-1
>
> Note that T1 and T2 can be sent together. If you are using an
> instance-level constraint like this:
>     // limit one transition message at a time for each instance
>     builder.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION")
>            .addConstraintAttribute("INSTANCE", ".*")
>            .addConstraintAttribute("CONSTRAINT_VALUE", "1");
>
> Then T1 and T2 will be sent together in the first round since T1 and T2
> are sent to two different nodes. And T3 will be sent in the next round.
>
> If you are specifying a cluster-level constraint like this:
>     // limit one transition message at a time for the entire cluster
>     builder.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION")
>            .addConstraintAttribute("CONSTRAINT_VALUE", "1");
>
> Then the Helix controller will send T1 in the first round, then T2, then
> T3. T1 is sent before T2 because, in the state model definition, you
> specified that the Offline->Slave transition has a higher priority than
> Master->Slave.
>
> The test runs without problem. Here is the output:
>
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Start zookeeper at localhost:2183 in thread main
> START TestMessageThrottle2 at Wed May 08 15:57:21 PDT 2013
> Creating cluster: TestMessageThrottle2
> Starting Controller{Cluster:TestMessageThrottle2, Port:12000,
> Zookeeper:localhost:2183}
> StatusPrinter.onIdealStateChange:state = MyResource,
> {IDEAL_STATE_MODE=AUTO, NUM_PARTITIONS=1, REPLICAS=2,
> STATE_MODEL_DEF_REF=MasterSlave}{}{MyResource=[node1, node2]}
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{}{}
>
> StatusPrinter.onControllerChange:org.apache.helix.NotificationContext@6e3404f
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2,
> {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2,
> {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1,
> SESSION_ID=13e865cfca60006}{}{}
> StatusPrinter.onIdealStateChange:state = MyResource,
> {IDEAL_STATE_MODE=AUTO, NUM_PARTITIONS=1, REPLICAS=2,
> STATE_MODEL_DEF_REF=MasterSlave}{}{MyResource=[node1, node2]}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2,
> {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2,
> {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1,
> SESSION_ID=13e865cfca60006}{}{}
>
> StatusPrinter.onControllerChange:org.apache.helix.NotificationContext@76d3046
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{MyResource={node2=MASTER}}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node1,
> {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2,
> {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{MyResource={node2=MASTER}}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node1,
> {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2,
> {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node1,
> {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1,
> SESSION_ID=13e865cfca60008}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2,
> {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1,
> SESSION_ID=13e865cfca60006}{}{}
> StatusPrinter.onIdealStateChange:state = MyResource,
> {IDEAL_STATE_MODE=AUTO, NUM_PARTITIONS=1, REPLICAS=2,
> STATE_MODEL_DEF_REF=MasterSlave}{}{MyResource=[node1, node2]}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node1,
> {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onInstanceConfigChange:instanceConfig = node2,
> {HELIX_ENABLED=true, HELIX_HOST=localhost}{}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{MyResource={node2=MASTER}}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node1,
> {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1,
> SESSION_ID=13e865cfca60008}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2,
> {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1,
> SESSION_ID=13e865cfca60006}{}{}
>
> StatusPrinter.onControllerChange:org.apache.helix.NotificationContext@b9deddb
> StatusPrinter.onLiveInstanceChange:liveInstance = node1,
> {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1,
> SESSION_ID=13e865cfca60008}{}{}
> StatusPrinter.onLiveInstanceChange:liveInstance = node2,
> {HELIX_VERSION=${project.version}, LIVE_INSTANCE=11881@zzhang-mn1,
> SESSION_ID=13e865cfca60006}{}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{MyResource={node1=SLAVE, node2=MASTER}}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{MyResource={node1=SLAVE, node2=MASTER}}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{MyResource={node1=MASTER, node2=SLAVE}}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{MyResource={node1=MASTER, node2=SLAVE}}{}
> StatusPrinter.onExternalViewChange:externalView = MyResource,
> {BUCKET_SIZE=0}{MyResource={node1=MASTER, node2=SLAVE}}{}
> true: wait 489ms,
> ClusterStateVerifier$BestPossAndExtViewZkVerifier(TestMessageThrottle2@localhost:2183)
> END TestMessageThrottle2 at Wed May 08 15:57:30 PDT 2013
>
> Thanks,
> Jason
>
>
>
>
> On Tue, May 7, 2013 at 8:25 PM, Ming Fang <mingfang@mac.com> wrote:
>
>> Here is the code that I'm using to test
>> https://github.com/mingfang/apache-helix/tree/master/helix-example
>>
>> In ZAC.java line 134 is where I'm adding the constraint.
>> Line 204 is where I'm setting the state transition priority list.
>>
>> The steps I'm using are:
>> 1-Run ZAC and wait for the StatusPrinter printouts
>> 2-Run Node2 and wait for it to transition to MASTER
>> 3-Run Node1
>> At this point we see the problem where the external view will say
>> node1=SLAVE and node2=SLAVE.
>>
>> I can get the MessageThrottleStage to work by replacing line 205 with this:
>>     String key = item.toString();
>> But even with message throttling working, I still can't get the transition
>> sequence I need.
>>
>>
>> On May 7, 2013, at 11:43 AM, kishore g <g.kishore@gmail.com> wrote:
>>
>> Can you provide the code snippet you used to add the constraint?
>> It looks like you are setting the constraint at the INSTANCE level.
>>
>>
>>
>>
>> On Mon, May 6, 2013 at 9:52 PM, Ming Fang <mingfang@mac.com> wrote:
>>
>>> I almost have this working.
>>> However I'm experiencing a potential bug in MessageThrottleStage line
>>> 205.
>>> The problem is that the throttleMap's key contains the INSTANCE=<id> in
>>> it.
>>> This effectively makes trying to throttle across the entire cluster
>>> impossible.
>>>
>>> On Apr 24, 2013, at 2:07 PM, Zhen Zhang <zzhang@linkedin.com> wrote:
>>>
>>> > Hi Ming, to set the constraint so that only one transition message is
>>> > sent at a time, you can take a look at the test example of
>>> > TestMessageThrottle. You need to add a message constraint as follows:
>>> >
>>> > // build a message constraint
>>> > ConstraintItemBuilder builder = new ConstraintItemBuilder();
>>> > builder.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION")
>>> >   .addConstraintAttribute("INSTANCE", ".*")
>>> >   .addConstraintAttribute("CONSTRAINT_VALUE", "1");
>>> >
>>> > // add the constraint to the cluster
>>> > helixAdmin.setConstraint(clusterName, ConstraintType.MESSAGE_CONSTRAINT,
>>> >     "constraint1", builder.build());
>>> >
>>> >
>>> > The message constraint is separate from the ideal state and is not
>>> > specified in the JSON file of the ideal state.
>>> >
>>> > Thanks,
>>> > Jason
>>> >
>>> >
>>> >
>>> >
>>> > On 4/23/13 2:40 PM, "Ming Fang" <mingfang@mac.com> wrote:
>>> >
>>> >> Kishore
>>> >>
>>> >> It sounds like the solution is to set the constraints so that only one
>>> >> transition happens at a time.
>>> >> Can you point me to an example of how to do this?
>>> >> Also, is this something I can set in the JSON file?
>>> >>
>>> >> Sent from my iPad
>>> >>
>>> >> On Apr 1, 2013, at 11:32 AM, kishore g <g.kishore@gmail.com> wrote:
>>> >>
>>> >>> Hi Ming,
>>> >>>
>>> >>> Thanks for the detailed explanation. Actually, 5 & 6 happen in
>>> >>> parallel; Helix tries to parallelize the transitions as much as
>>> >>> possible.
>>> >>>
>>> >>> There is another feature in Helix that allows you to sort the
>>> >>> transitions based on some priority. See STATE_TRANSITION_PRIORITY_LIST
>>> >>> in the state model definition. But after sorting, Helix will send as
>>> >>> many as possible in parallel without violating constraints.
>>> >>>
>>> >>> In your case you want the priority to be S-M, O-S, M-S, but that is
>>> >>> not sufficient, since O-S and M-S will be sent in parallel.
>>> >>>
>>> >>> Additionally, what you need to do is set a constraint on transitions
>>> >>> so that there is only one transition per partition at any time. This
>>> >>> will basically make the order 6 5 7, and they will be executed
>>> >>> sequentially per partition.
>>> >>>
>>> >>> We will try this out and let you know. You don't need to change any
>>> >>> code in Helix or your app. You should be able to tweak the
>>> >>> configuration dynamically.
>>> >>>
>>> >>> We will try to think of a more elegant way to solve this. I will
>>> >>> file a JIRA and add more info.
>>> >>>
>>> >>> I also want to ask this question: if it is mandatory for a node to
>>> >>> talk to the MASTER when it comes up, what happens when the nodes are
>>> >>> started for the first time, or when all nodes crash and come back?
>>> >>>
>>> >>> thanks,
>>> >>> Kishore G
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >
>>>
>>>
>>
>>
>
