helix-user mailing list archives

From Ming Fang <mingf...@mac.com>
Subject Re: Prevent failback to MASTER after failover
Date Sun, 31 Mar 2013 05:21:52 GMT
Hi Kishore

Our system requires deterministic placement of the MASTER and SLAVE.
This is a sample of the idealstate file we're using:

{
    "id": "Cluster",
    "simpleFields":{
        "IDEAL_STATE_MODE":"AUTO",
        "NUM_PARTITIONS": "1",
        "REPLICAS": "2",
        "STATE_MODEL_DEF_REF":"MasterSlave"
    },
    "mapFields":{
    },
    "listFields":{
        "Partition_0" : [ "node_1", "node_2" ]
    }
}

In this example, node_1 is the MASTER.
If node_1 dies, then node_2 will take over.
But if node_1 then gets restarted, it will try to become MASTER again.
We normally keep the dead node down to avoid this problem.
But I was hoping for a more elegant solution.
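
The failback described above can be reproduced with a small simulation. This is only a sketch in Python, assuming that in AUTO mode the first live instance in a partition's preference list is the one chosen as MASTER. `pick_master` is a hypothetical helper written for illustration, not a Helix API.

```python
def pick_master(preference_list, live_nodes):
    """Return the first live node in the preference list, or None.

    Assumption: this mirrors how AUTO mode orders MASTER candidates.
    """
    for node in preference_list:
        if node in live_nodes:
            return node
    return None


# Preference list taken from the idealstate sample above.
preference_list = ["node_1", "node_2"]

before_failure = pick_master(preference_list, {"node_1", "node_2"})
after_failure = pick_master(preference_list, {"node_2"})
after_restart = pick_master(preference_list, {"node_1", "node_2"})
```

Under that assumption, mastership moves to node_2 while node_1 is down and returns to node_1 as soon as it is live again, which is the behavior we want to avoid.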

One solution would be for node_1 to come up and realize that node_2 has taken over due to
the previous failure.
In that case node_1 would decide to remain a SLAVE node instead.
Should this be done by the Controller instead?
Should I create a new statemodel other than MASTER/SLAVE?
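
One way this idea could be sketched: once node_2 has taken over, rewrite the preference list so that the current MASTER comes first, so that a restarted node_1 rejoins as SLAVE. The Python below operates on a plain dict shaped like the idealstate sample. `promote_current_master` is a hypothetical helper, and writing the updated idealstate back to the cluster is deliberately left out. It assumes AUTO mode honors the order of the list.

```python
import copy


def promote_current_master(ideal_state, partition, current_master):
    """Return a copy of ideal_state with current_master first in the
    partition's preference list. The input dict is not modified."""
    updated = copy.deepcopy(ideal_state)
    nodes = updated["listFields"][partition]
    if current_master not in nodes:
        raise ValueError(f"{current_master} is not a replica of {partition}")
    # Keep the relative order of the remaining replicas.
    others = [n for n in nodes if n != current_master]
    updated["listFields"][partition] = [current_master] + others
    return updated


ideal_state = {
    "id": "Cluster",
    "simpleFields": {
        "IDEAL_STATE_MODE": "AUTO",
        "NUM_PARTITIONS": "1",
        "REPLICAS": "2",
        "STATE_MODEL_DEF_REF": "MasterSlave",
    },
    "mapFields": {},
    "listFields": {"Partition_0": ["node_1", "node_2"]},
}

# After failover, node_2 is the acting MASTER of Partition_0.
updated = promote_current_master(ideal_state, "Partition_0", "node_2")
```

Whether this reordering belongs in the node, in the Controller, or in an external tool is the open question.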

On Mar 31, 2013, at 12:50 AM, kishore g <g.kishore@gmail.com> wrote:

> Hi Ming,
> 
> There are a couple of ways you can achieve that. Before providing an answer, how many
> partitions do you have? Did you generate the idealstate yourself, or did you use Helix
> to come up with the initial idealstate?
> 
> The reason the old master tries to become a master again is to distribute the load among
> the nodes currently alive. Otherwise the old node that comes back will never become a
> master for any partition and will remain idle until another failure happens in the system.
> 
> thanks,
> Kishore G
> 
> 
> On Sat, Mar 30, 2013 at 8:01 PM, Ming Fang <mingfang@mac.com> wrote:
> We're using MASTER SLAVE in AUTO mode.
> When the MASTER is killed, the failover works properly: the SLAVE transitions
> to become MASTER.
> However, if the failed MASTER is restarted, it will try to become MASTER again.
> This is causing a problem in our business logic.
> Is there a way to prevent the failed instance from becoming MASTER again?
> 

