zookeeper-user mailing list archives

From Andor Molnar <an...@apache.org>
Subject Re: The current epoch, 7, is older than the last zxid, 8589935882
Date Mon, 26 Aug 2019 15:51:28 GMT
I confirmed that the fix is included in 3.4.13. That’s why I asked whether you can see an
‘updatingEpoch’ file in the data folder.
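
A minimal Python sketch of that check, offered as an illustration rather than an official ZooKeeper tool. It assumes the data directory is the /var/lib/zookeeper/version-2 path shown in the quoted output below; the function name `inspect_epoch_files` is hypothetical. Reading these files may require root, as the `sudo cat` calls in the thread suggest.

```python
from pathlib import Path

# File names taken from this thread; nothing here is a ZooKeeper API.
EPOCH_FILES = ("acceptedEpoch", "currentEpoch", "updatingEpoch", "currentEpoch.tmp")


def inspect_epoch_files(data_dir):
    """Return {file name: stripped contents, or None if the file is absent}."""
    report = {}
    for name in EPOCH_FILES:
        path = Path(data_dir) / name
        # The files in the thread hold a bare number with no trailing newline;
        # strip() also tolerates one if present.
        report[name] = path.read_text().strip() if path.is_file() else None
    return report
```

Calling `inspect_epoch_files("/var/lib/zookeeper/version-2")` on the failing node would show at a glance whether ‘updatingEpoch’ exists and which values the other epoch files hold.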

I’m not saying the issue is unrelated to ZooKeeper, but I want to make sure that you’re running the
right version by checking the beginning of the ZK logs.

Andor
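
For readers following the numbers in the quoted error: a zxid is a 64-bit value whose high 32 bits hold the epoch and whose low 32 bits hold a counter. The sketch below decodes the zxid from the stack trace further down and applies the comparison that the error message describes. The helper names are illustrative, not ZooKeeper APIs.

```python
def split_zxid(zxid):
    """Split a 64-bit zxid into (epoch, counter): high and low 32 bits."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def epoch_is_stale(current_epoch, last_zxid):
    """True when the epoch embedded in the last zxid is newer than current_epoch."""
    epoch_of_zxid, _counter = split_zxid(last_zxid)
    return epoch_of_zxid > current_epoch


# Values from the thread: currentEpoch file holds 7, last zxid is 34359738370.
print(split_zxid(34359738370))         # (8, 2)
print(epoch_is_stale(7, 34359738370))  # True
```

With these values the last zxid belongs to epoch 8, which matches the 8 stored in acceptedEpoch and currentEpoch.tmp, while currentEpoch still holds 7.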



> On 2019. Aug 26., at 13:43, Debraj Manna <subharaj.manna@gmail.com> wrote:
> 
> Below is the content of currentEpoch.tmp
> 
> support@platform2:/var/lib/zookeeper/version-2$ sudo cat acceptedEpoch
> 8support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch
> 7support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch.tmp
> 8support@platform2
> 
> The ZooKeeper startup logs have rolled over because the issue has been present for some
> time. Would the current log, with the node in this state, help? By the way, why do you
> think this issue may not be related to ZooKeeper?
> 
> 
> 
> On Mon, Aug 26, 2019 at 4:56 PM Andor Molnar <andor@apache.org> wrote:
> 
>> Hi Debraj,
>> 
>> The fix should be in all 3.4 versions from 3.4.6 onward, including 3.4.13.
>> Can you see an ‘updatingEpoch’ file in /var/lib/zookeeper/version-2?
>> Also, what is ‘currentEpoch.tmp’? I’m not sure whether it relates to ZooKeeper.
>> 
>> Would you please share full startup logs of the failing node?
>> 
>> Regards,
>> Andor
>> 
>> 
>> 
>> 
>>> On 2019. Aug 23., at 18:53, Debraj Manna <subharaj.manna@gmail.com> wrote:
>>> 
>>> Can someone answer my query below?
>>>
>>> I am confused after going through ZOOKEEPER-1653
>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-1653> and ZOOKEEPER-2354
>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2354>. The issues say it
>>> is fixed in 3.4.6 but exists in 3.5.x. But I am seeing the issue in 3.4.13
>>> as well. Can someone let me know whether the issue is also present in 3.4.13?
>>> 
>>> 
>>> On Wed 21 Aug, 2019, 12:35 PM Debraj Manna, <subharaj.manna@gmail.com>
>>> wrote:
>>> 
>>>> With the other two ZooKeeper servers running, I stopped ZooKeeper on
>>>> the broken node, then deleted all the contents of /var/lib/zookeeper/version-2
>>>> and started ZooKeeper again on the node. It is running fine now and has
>>>> fetched all the data from the other servers.
>>>>
>>>> I am confused after going through ZOOKEEPER-1653
>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-1653> and ZOOKEEPER-2354
>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2354>. The issues say
>>>> it is fixed in 3.4.6 but exists in 3.5.x. But I am seeing the issue in
>>>> 3.4.13 as well. Can someone let me know whether the issue is also present
>>>> in 3.4.13?
>>>> 
>>>> 
>>>> 
>>>> On Wed, Aug 21, 2019 at 8:54 AM Debraj Manna <subharaj.manna@gmail.com>
>>>> wrote:
>>>> 
>>>>> Thanks for replying.
>>>>> 
>>>>> What is the recommended way to remove a node, delete all data from it,
>>>>> and make it start fresh?
>>>>> 
>>>>> On Wed 21 Aug, 2019, 12:58 AM Enrico Olivelli, <eolivelli@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hello,
>>>>>> Sorry for the late reply.
>>>>>> If you have 3 servers, you can nuke the broken one and make it start from
>>>>>> scratch; it will join the cluster and then recover data from the other
>>>>>> servers.
>>>>>> 
>>>>>> Try it in a staging env, not in production
>>>>>> 
>>>>>> Enrico
>>>>>> 
>>>>>> On Tue, 20 Aug 2019, 20:30 Debraj Manna <subharaj.manna@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> The same has been asked on Stack Overflow
>>>>>>> <https://stackoverflow.com/questions/57574298/zookeeper-error-the-current-epoch-is-older-than-the-last-zxid>
>>>>>>> as well, but there has been no response there either.
>>>>>>>
>>>>>>> Does anyone have any thoughts on this one?
>>>>>>> 
>>>>>>> On Tue, Aug 20, 2019 at 4:43 PM Debraj Manna <subharaj.manna@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Posted wrong Jira link. I meant
>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2354. Can someone let me
>>>>>>>> know what is the recommended way to recover the node?
>>>>>>>>
>>>>>>>> support@platform2:/var/lib/zookeeper/version-2$ sudo cat acceptedEpoch
>>>>>>>> 8support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch
>>>>>>>> 7support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch.tmp
>>>>>>>> 8support@platform2
>>>>>>>> 
>>>>>>>> On Tue, Aug 20, 2019 at 3:14 PM Debraj Manna <subharaj.manna@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> I am using a ZooKeeper ensemble of 3 nodes running 3.4.13. Sometimes,
>>>>>>>>> after a reboot of the machine, ZooKeeper does not start and I see the
>>>>>>>>> errors below in the logs.
>>>>>>>>>
>>>>>>>>> I have seen https://issues.apache.org/jira/browse/ZOOKEEPER-1653. Can
>>>>>>>>> someone let me know whether this is fixed in 3.4.13, as I can see the issue
>>>>>>>>> is still open? Also, can someone suggest the recommended way to recover
>>>>>>>>> the setup?
>>>>>>>>> 
>>>>>>>>> 2019-08-19 04:18:36,906 [myid:2] - ERROR [main:QuorumPeer@692] - Unable to load database on disk
>>>>>>>>> java.io.IOException: The current epoch, 7, is older than the last zxid, 34359738370
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:674)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:635)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
>>>>>>>>> 2019-08-19 04:18:36,908 [myid:2] - ERROR [main:QuorumPeerMain@92] - Unexpected exception, exiting abnormally
>>>>>>>>> java.lang.RuntimeException: Unable to run quorum server
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:693)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:635)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
>>>>>>>>> Caused by: java.io.IOException: The current epoch, 7, is older than the last zxid, 34359738370
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:674)
>>>>>>>>>     ... 4 more
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>> 
>> 

