hadoop-zookeeper-dev mailing list archives

From "Patrick Hunt (JIRA)" <j...@apache.org>
Subject [jira] Commented: (ZOOKEEPER-335) zookeeper servers should commit the new leader txn to their logs.
Date Thu, 17 Jun 2010 23:34:23 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880001#action_12880001 ]

Patrick Hunt commented on ZOOKEEPER-335:
----------------------------------------

Thanks for the log Mike. This issue does seem similar to what Charity reported:

2010-06-17 14:35:34,263 - FATAL [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Follower@71] - Leader epoch 1 is less than our epoch 2
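
To check whether a given log contains this message, something like the following could be used. It is a minimal sketch in Python, assuming the message wording matches the line quoted above; the function name is illustrative only. The epoch values are kept as strings because the number base used in the message is not stated here.

```python
import re

# Pattern for the FATAL message quoted above. Assumption: the wording is the
# same in the server version being checked. \s+ tolerates wrapped lines.
EPOCH_MISMATCH = re.compile(
    r"Leader epoch\s+([0-9a-fA-F]+)\s+is less than our epoch\s+([0-9a-fA-F]+)"
)

def find_epoch_mismatches(log_text):
    """Return (leader_epoch, our_epoch) string pairs for each matching message."""
    return [(m.group(1), m.group(2)) for m in EPOCH_MISMATCH.finditer(log_text)]

sample = (
    "2010-06-17 14:35:34,263 - FATAL "
    "[QuorumPeer:/0:0:0:0:0:0:0:0:2181:Follower@71] "
    "- Leader epoch 1 is less than our epoch 2"
)
print(find_epoch_mismatches(sample))
```

A server whose log yields at least one pair is a candidate for the server that cannot rejoin the quorum.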

Unfortunately the attached log only covers the period after the problem occurred. Any chance
you could upload the logs from the initial event, that is, from when the problem first
started? The logs from the other servers in the ensemble (again, from the time the problem
first occurred) would also really help. Thanks.

Have you been able to clear the problem? It's fairly straightforward to resolve. Charity
resolved it by: 1) bringing down the failing server, 2) clearing the data directory of that
server (only), 3) starting that server. You only want to do this for the server that's unable
to rejoin the quorum, i.e. the one that's outputting "Leader epoch 1 is less than our epoch 2",
_not_ for all servers in the ensemble.
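
Step 2 of the steps above could be sketched as follows. This is a cautious variation, not what Charity ran: it moves the contents of the data directory to a backup location instead of deleting them. It assumes the common layout where the data directory holds a `myid` file (which the server needs in order to start) alongside the snapshot and log files, and that no separate transaction log directory is configured. The function name and paths are hypothetical. Stopping and starting the server (steps 1 and 3) are left to whatever scripts you normally use.

```python
import shutil
import time
from pathlib import Path

def set_aside_server_data(data_dir, backup_root):
    """Move everything except myid out of data_dir into a fresh backup directory.

    Run this only while the affected server is stopped, and only on the one
    server that cannot rejoin the quorum.
    """
    data_dir = Path(data_dir)
    backup = Path(backup_root) / f"zk-data-{int(time.time())}"
    backup.mkdir(parents=True)
    moved = []
    for entry in sorted(data_dir.iterdir()):
        if entry.name == "myid":
            # Assumption: myid identifies this server and must stay in place.
            continue
        shutil.move(str(entry), str(backup / entry.name))
        moved.append(entry.name)
    return backup, moved
```

Keeping the old files around also means they can still be attached to the issue if they turn out to cover the initial event.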

> zookeeper servers should commit the new leader txn to their logs.
> -----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-335
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-335
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.0
>            Reporter: Mahadev konar
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.4.0
>
>         Attachments: zk.log.gz
>
>
> Currently the zookeeper followers do not commit the new leader election. This will cause
> problems in failure scenarios, with a follower acking the same leader txn id twice, possibly
> for two different intermittent leaders, which allows them to propose two different txns
> with the same zxid.
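
The collision described in the issue can be illustrated with a small sketch. It relies on the documented zxid layout (a 64-bit number with the epoch in the high 32 bits and a counter in the low 32 bits); the helper names are made up for this example and it is not the server's code.

```python
def make_zxid(epoch, counter):
    """Pack an epoch and a counter into a 64-bit zxid (epoch in the high 32 bits)."""
    return (epoch << 32) | (counter & 0xFFFFFFFF)

def split_zxid(zxid):
    """Return (epoch, counter) for a zxid."""
    return zxid >> 32, zxid & 0xFFFFFFFF

# Two short-lived leaders that both end up proposing in epoch 2 give their
# first proposal the same zxid, although the transactions may differ.
zxid_from_leader_a = make_zxid(2, 1)
zxid_from_leader_b = make_zxid(2, 1)

# If the new epoch is durably recorded before the next election, the second
# leader is forced onto a higher epoch and the zxids no longer clash.
zxid_with_new_epoch = make_zxid(3, 1)

print(zxid_from_leader_a == zxid_from_leader_b)
print(split_zxid(zxid_with_new_epoch))
```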

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

