helix-user mailing list archives

From Subramanian Raghunathan <subramanian.raghunat...@integral.com>
Subject RE: State transitions of partitions and Distributed Cluster controllers
Date Fri, 29 Jan 2016 02:52:22 GMT
Hi Helix Team,

@Kishore
This definitely helps a lot. Thanks.

Yes, reset will be called when you lose zk session. It will also be invoked when a partition
goes to ERROR state and you want to get back to OFFLINE state. ( I am not 100% sure if reset
api is invoked or ERROR to OFFLINE transition is invoked). Jason might be able to answer that.

My question was about the event life cycle.
For example, when the session is lost, the reset() method is invoked.
When the session later reconnects, will the corresponding handler method be notified,
based on the state model?

Thanks & Regards,
Subramanian Raghunathan.

From: kishore g [mailto:g.kishore@gmail.com]
Sent: Wednesday, January 27, 2016 11:31 AM
To: user@helix.apache.org
Cc: user@helix.incubator.apache.org; dev@helix.incubator.apache.org
Subject: Re: State transitions of partitions and Distributed Cluster controllers

Thanks for sending it again.

I looked at the code. Even though the retry is handled on the participant, it looks like we
are not setting the retry count for state transition messages. We do have the ability to set
it for custom message types.

The fix is easy: we just need to call message.setRetryCount in this class:

https://github.com/apache/helix/blob/9e51cb7bdf8424df46c6fa353e7c80d984c21193/helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationStage.java

We can read the retry count from cluster config.
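
To make the idea concrete, here is a minimal, self-contained sketch of a participant-side retry loop that honors a retry count carried on the message. It is not Helix code: TransitionMessage is a hypothetical stand-in for the real message class, execute() is an invented helper, and the value 2 stands for whatever would be read from the cluster config. It also assumes the retry count means "retries after the first attempt", which may not match Helix's actual semantics.

```java
import java.util.concurrent.Callable;

public class RetrySketch {

    /** Hypothetical stand-in for a state transition message; NOT the Helix Message class. */
    static final class TransitionMessage {
        private int retryCount; // assumed meaning: retries allowed after the first attempt

        void setRetryCount(int retryCount) {
            this.retryCount = retryCount;
        }

        int getRetryCount() {
            return retryCount;
        }
    }

    /**
     * Runs the handler once, then retries up to message.getRetryCount() more times
     * while it keeps failing. Returns the number of attempts used on success,
     * or -1 if every attempt failed.
     */
    static int execute(TransitionMessage message, Callable<Boolean> handler) {
        int maxAttempts = 1 + message.getRetryCount();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (handler.call()) {
                    return attempt;
                }
            } catch (Exception e) {
                // An exception counts as a failed attempt; fall through and retry.
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        TransitionMessage message = new TransitionMessage();
        int retryCountFromClusterConfig = 2; // assumed value, for illustration only
        message.setRetryCount(retryCountFromClusterConfig);

        int[] calls = {0};
        // The handler fails on its first two calls and succeeds on the third.
        int attempts = execute(message, () -> ++calls[0] >= 3);
        System.out.println("Attempts used: " + attempts);
    }
}
```

Without the retry count set on the message, the same loop would give up after the first failed attempt, which matches the behavior described above for state transition messages.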

I recently sent another email with instructions for setting up a distributed controller.
In short, the steps are:

helixadmin create-cluster super_cluster
helixadmin addInstance super_cluster  controller1
helixadmin addInstance super_cluster  controller2
helixadmin addInstance super_cluster  controller3

Start the three controllers in distributed mode and provide super_cluster as the cluster name.

Now any time you create a cluster, you can add that cluster as a resource in the super_cluster.
One of the controllers will automatically start managing the new cluster. For example:
helixadmin create-cluster cluster1
helixadmin addresource super_cluster cluster1 AUTO mode leaderstandbymodel

I don't remember the exact commands off the top of my head, but it should look something like that.
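
As a rough illustration of what the steps above achieve, the following toy model treats each managed cluster as a resource of the super cluster and gives it exactly one leader controller, picking another live controller if the leader goes away. This is only a sketch of the concept: SuperClusterSketch and all of its methods are hypothetical, it does not call any Helix API, and the least-loaded assignment rule is an assumption rather than how Helix actually chooses the leader.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a super cluster; hypothetical, not Helix code. */
public class SuperClusterSketch {
    private final List<String> liveControllers = new ArrayList<>();
    // Managed cluster name -> controller currently leading it (null if none).
    private final Map<String, String> leaderByCluster = new LinkedHashMap<>();

    void addController(String name) {
        liveControllers.add(name);
        rebalance();
    }

    void removeController(String name) {
        liveControllers.remove(name);
        rebalance();
    }

    /** Mirrors "add that cluster as a resource in the super_cluster". */
    void addManagedCluster(String cluster) {
        leaderByCluster.put(cluster, null);
        rebalance();
    }

    String leaderOf(String cluster) {
        return leaderByCluster.get(cluster);
    }

    /** Gives every cluster without a live leader to the least-loaded live controller. */
    private void rebalance() {
        for (Map.Entry<String, String> entry : leaderByCluster.entrySet()) {
            String leader = entry.getValue();
            if (leader == null || !liveControllers.contains(leader)) {
                entry.setValue(leastLoadedController());
            }
        }
    }

    /** Returns the live controller leading the fewest clusters, or null if none is live. */
    private String leastLoadedController() {
        String best = null;
        int bestLoad = Integer.MAX_VALUE;
        for (String controller : liveControllers) {
            int load = 0;
            for (String leader : leaderByCluster.values()) {
                if (controller.equals(leader)) {
                    load++;
                }
            }
            if (load < bestLoad) {
                best = controller;
                bestLoad = load;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        SuperClusterSketch superCluster = new SuperClusterSketch();
        superCluster.addController("controller1");
        superCluster.addController("controller2");
        superCluster.addController("controller3");

        superCluster.addManagedCluster("cluster1");
        System.out.println("cluster1 is managed by " + superCluster.leaderOf("cluster1"));

        // Simulate the leader going away; another controller takes over.
        superCluster.removeController(superCluster.leaderOf("cluster1"));
        System.out.println("cluster1 is now managed by " + superCluster.leaderOf("cluster1"));
    }
}
```

In a real deployment the assignment and failover would be driven by the state model named in the addresource step, not by code like this.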

Yes, reset will be called when you lose zk session. It will also be invoked when a partition
goes to ERROR state and you want to get back to OFFLINE state. ( I am not 100% sure if reset
api is invoked or ERROR to OFFLINE transition is invoked). Jason might be able to answer that.

Hope that helps.


On Wed, Jan 27, 2016 at 10:51 AM, Subramanian Raghunathan <subramanian.raghunathan@integral.com>
wrote:
Hi Helix Team,

I am evaluating Helix as a cluster management framework. I believe it’s very modular and
highly customizable, with a variety of out-of-the-box capabilities. Kudos to the team!

I have the following queries:


1)      How do I configure the number of retries in state transition handlers?

http://markmail.org/message/vgc4nksocolqiqx5
I referred to this particular mail conversation: “you can configure the
number of retries before a transition is considered as failed”


2)      Please point me to an example/interfaces for starting a distributed cluster controller
and for adding the various clusters that the controllers are supposed to manage.


3)      What would be the event life cycle of the reset() method in TransitionHandler?

a.       I believe this gets called if the ZooKeeper client session is lost or there’s an update
to the cluster configuration.

Note: I am using the “helix-0.7.1” version.

Thanks & Regards,
Subramanian Raghunathan
