helix-commits mailing list archives

From "kishore gopalakrishna (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HELIX-29) Not receiving transitions after participant reconnection
Date Tue, 19 Mar 2013 20:57:17 GMT

     [ https://issues.apache.org/jira/browse/HELIX-29?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kishore gopalakrishna resolved HELIX-29.
----------------------------------------

    Resolution: Fixed
    
> Not receiving transitions after participant reconnection
> --------------------------------------------------------
>
>                 Key: HELIX-29
>                 URL: https://issues.apache.org/jira/browse/HELIX-29
>             Project: Apache Helix
>          Issue Type: Bug
>    Affects Versions: 0.6.0-incubating
>            Reporter: Santiago Perez
>            Assignee: dafu
>             Fix For: 0.6.1-incubating
>
>
> We have nodes whose ZK connections expire due to long GC pauses. We handle the
> expiration by disconnecting the participant and reconnecting it afterwards. Usually this
> means the state gets reset to IDLE and we get the proper transitions to the ideal state (in
> this case ONLINE).
> However, sometimes we don't get any transitions at all, although the disconnection and
> reconnection are successful. One interesting side effect is that the IDEALSTATE for that node
> remains ONLINE and the EXTERNALVIEW remains ONLINE, yet the CURRENTSTATE shows IDLE and no
> transitions are sent back to the participant.
> Here are the ZK contents for one of these nodes:
> [zk: localhost:2122(CONNECTED) 41] get /<NAMESPACE>/<CLUSTER>/INSTANCES/<PARTICIPANT-NAME>/CURRENTSTATES/338bfded5e60877/<RESOURCE-NAME>
> {
>   "id":"<RESOURCE-NAME>"
>   ,"simpleFields":{
>     "BUCKET_SIZE":"0"
>     ,"SESSION_ID":"338bfded5e60877"
>     ,"STATE_MODEL_DEF":"Bootstrap"
>     ,"STATE_MODEL_FACTORY_NAME":"<FACTORY-NAME>"
>   }
>   ,"listFields":{
>   }
>   ,"mapFields":{
>     "<RESOURCE-NAME>_17":{
>       "CURRENT_STATE":"IDLE"
>     }
>   }
> }
> cZxid = 0x2010d26c8
> ctime = Sun Jan 20 03:14:57 PST 2013
> mZxid = 0x2010d26f5
> mtime = Sun Jan 20 03:14:58 PST 2013
> pZxid = 0x2010d26c8
> cversion = 0
> dataVersion = 2
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 281
> numChildren = 0
> [zk: localhost:2122(CONNECTED) 42] get /<NAMESPACE>/<CLUSTER>/EXTERNALVIEW/<RESOURCE-NAME>
> {
>   "id" : "<RESOURCE-NAME>",
>   "simpleFields" : {
>     "BUCKET_SIZE" : "0"
>   },
>   "mapFields" : {
>      
>     ... PREVIOUS PARTITIONS ...
>     "<RESOURCE-NAME>_17" : {
>       "<PARTICIPANT>" : "ONLINE"
>     },
>     ... FOLLOWING PARTITIONS ...
>   },
>   "listFields" : {
>   }
> }
> cZxid = 0x200595a78
> ctime = Thu Nov 08 18:06:23 PST 2012
> mZxid = 0x201077ec6
> mtime = Fri Jan 18 16:40:03 PST 2013
> pZxid = 0x200595a78
> cversion = 0
> dataVersion = 4666
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 3367
> numChildren = 0
> The ideal state is very similar to the EXTERNALVIEW; I can post that too if you want.
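
The mismatch described above (CURRENTSTATE says IDLE while EXTERNALVIEW still says ONLINE for the same partition) can be checked mechanically once both znodes have been dumped. The following is only a minimal sketch for illustration, not part of Helix: the function name `stale_partitions` and the sample names `myResource` and `node1` are hypothetical stand-ins for the redacted <RESOURCE-NAME> and <PARTICIPANT> values, and the records are trimmed copies of the shapes quoted in the report.

```python
import json


def stale_partitions(current_state, external_view, participant):
    """Return {partition: (current, viewed)} for every partition whose
    CURRENT_STATE differs from the state EXTERNALVIEW lists for `participant`."""
    view_map = external_view.get("mapFields", {})
    mismatches = {}
    for partition, fields in current_state.get("mapFields", {}).items():
        current = fields.get("CURRENT_STATE")
        viewed = view_map.get(partition, {}).get(participant)
        # Only report partitions the external view actually assigns to this participant.
        if viewed is not None and viewed != current:
            mismatches[partition] = (current, viewed)
    return mismatches


# Hypothetical records shaped like the znodes quoted in the report.
# The leading-comma layout printed by the ZK shell is still valid JSON.
CURRENT_STATE_JSON = """
{
  "id":"myResource"
  ,"simpleFields":{
    "SESSION_ID":"338bfded5e60877"
    ,"STATE_MODEL_DEF":"Bootstrap"
  }
  ,"listFields":{
  }
  ,"mapFields":{
    "myResource_17":{
      "CURRENT_STATE":"IDLE"
    }
  }
}
"""

EXTERNAL_VIEW_JSON = """
{
  "id" : "myResource",
  "simpleFields" : { "BUCKET_SIZE" : "0" },
  "mapFields" : {
    "myResource_16" : { "node1" : "ONLINE" },
    "myResource_17" : { "node1" : "ONLINE" }
  },
  "listFields" : { }
}
"""

if __name__ == "__main__":
    found = stale_partitions(
        json.loads(CURRENT_STATE_JSON), json.loads(EXTERNAL_VIEW_JSON), "node1"
    )
    for partition, (current, viewed) in sorted(found.items()):
        print(f"{partition}: CURRENTSTATE={current} EXTERNALVIEW={viewed}")
```

A non-empty result for a participant that has reconnected successfully would match the symptom in this issue. The check only detects the inconsistency; it says nothing about why the controller did not send the transitions.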

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
