helix-commits mailing list archives

From "Kanak Biscuitwala (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HELIX-321) Controller forgets that it's the leader
Date Mon, 25 Nov 2013 17:34:35 GMT

     [ https://issues.apache.org/jira/browse/HELIX-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kanak Biscuitwala updated HELIX-321:
------------------------------------

    Fix Version/s: 0.7.1-incubating

> Controller forgets that it's the leader
> ---------------------------------------
>
>                 Key: HELIX-321
>                 URL: https://issues.apache.org/jira/browse/HELIX-321
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Kanak Biscuitwala
>            Assignee: Kanak Biscuitwala
>             Fix For: 0.7.1-incubating
>
>         Attachments: leader_election.txt
>
>
> 1. See log messages:
> INFO [2013-11-22 17:34:11,919] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn
- Client session timed out, have not heard from server in 20171ms for sessionid 0x142016175c10856,
closing socket connection and attempting reconnect
> INFO [2013-11-22 17:34:22,051] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn
- Opening socket connection to server eat1-app87.corp/172.18.158.133:2181
> INFO [2013-11-22 17:34:22,052] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn
- Socket connection established to eat1-app87.corp/172.18.158.133:2181, initiating session
> INFO [2013-11-22 17:34:22,055] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn
- Unable to reconnect to ZooKeeper service, session 0x142016175c10856 has expired, closing
socket connection
> INFO [2013-11-22 17:34:22,055] main-EventThread - org.I0Itec.zkclient.ZkClient - zookeeper
state changed (Expired)
> INFO [2013-11-22 17:34:22,055] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixConnection
- KeeperState:Expired, expiredSessionId: 142016175c10856
> 2. Controller reconnects, removes all callbacks
> INFO [2013-11-22 17:34:22,068] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn
- Socket connection established to eat1-app87.corp/172.18.158.133:2181, initiating session
> INFO [2013-11-22 17:34:22,126] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn
- Session establishment complete on server eat1-app87.corp/172.18.158.133:2181, sessionid
= 0x142016175c1085c, negotiated timeout = 30000
> INFO [2013-11-22 17:34:22,126] main-EventThread - org.I0Itec.zkclient.ZkClient - zookeeper
state changed (SyncConnected)
> 3. Callbacks ignored; not leader, relinquishes leadership
> ERROR [2013-11-22 17:34:22,187] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.controller.GenericHelixController
- Cluster manager: controller1 is not leader. Pipeline will not be invoked
> INFO [2013-11-22 17:34:22,200] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixLeaderElection
- controller1 reqlinquishes leadership of cluster: perf-test-cluster
> 4. Controller reacquires leadership
> INFO [2013-11-22 17:34:22,204] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixLeaderElection
- controller1 is trying to acquire leadership for cluster: perf-test-cluster
> INFO [2013-11-22 17:34:22,215] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixLeaderElection
- controller1 acquires leadership of cluster: perf-test-cluster
> 5. Controller thinks it's not leader even though the LEADER node is in place and correct
> ERROR [2013-11-22 17:34:22,294] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.controller.GenericHelixController
- Cluster manager: controller1 is not leader. Pipeline will not be invoked
> 6. Controller tries to become leader when it already is???
> INFO [2013-11-22 17:34:22,335] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixLeaderElection
- controller1 is trying to acquire leadership for cluster: perf-test-cluster
> Logs attached
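
To reproduce the sequence above from the attached log, a small script can pull out only the session and leadership records and list them in order. The following is a rough sketch, not part of Helix: the function name `timeline`, the `EVENTS` table and its labels are made up for illustration, and the record layout is inferred solely from the lines quoted in this report (other log configurations may differ).

```python
import re
from datetime import datetime

# Header of a log record as quoted in the report, e.g.
# "INFO [2013-11-22 17:34:22,200] <thread> - <logger> - <message>"
RECORD = re.compile(
    r"(?P<level>INFO|ERROR|WARN|DEBUG) "
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] "
    r"(?P<rest>.*)"
)

# Substrings copied from the quoted messages (including the log's own
# spelling "reqlinquishes"), mapped to hypothetical short labels.
# Order matters: the first matching substring wins.
EVENTS = [
    ("session timed out", "session timeout"),
    ("has expired", "session expired"),
    ("Session establishment complete", "new session"),
    ("is not leader", "not-leader check"),
    ("reqlinquishes leadership", "relinquish"),
    ("is trying to acquire leadership", "acquire attempt"),
    ("acquires leadership", "acquired"),
]

STEP = re.compile(r"\d+\. ")  # reporter's numbered commentary lines


def timeline(text):
    """Return (timestamp, label) pairs for session/leadership records in text."""
    lines = []
    for line in text.splitlines():
        line = line.lstrip("> ").strip()  # drop mail quoting
        if line and not STEP.match(line):  # drop commentary, keep log text
            lines.append(line)
    # Records are wrapped across lines in the mail, so rejoin and re-split
    # in front of every level token that starts a new record.
    flat = " ".join(lines)
    records = re.split(r"(?=(?:INFO|ERROR|WARN|DEBUG) \[\d{4}-)", flat)
    out = []
    for rec in records:
        m = RECORD.match(rec)
        if not m:
            continue
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        for needle, label in EVENTS:
            if needle in m.group("rest"):
                out.append((ts, label))
                break
    return out
```

Running `timeline(open("leader_election.txt").read())` and printing the pairs would give a compact view of when the controller checked, gave up, and re-acquired leadership, assuming the attachment uses the same layout as the excerpt.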



--
This message was sent by Atlassian JIRA
(v6.1#6144)
