zookeeper-user mailing list archives

From Debraj Manna <subharaj.ma...@gmail.com>
Subject Re: The current epoch, 7, is older than the last zxid, 8589935882
Date Fri, 30 Aug 2019 12:57:57 GMT
Is there any issue with zookeeper 3.4.13?

On Thu, Aug 29, 2019 at 10:13 AM Andor Molnar <andor@apache.org> wrote:

> Thanks for the info, I’m still looking.
> So, this is an Ubuntu packaged version of ZooKeeper.
>
> Andor
>
>
>
> > On 2019. Aug 27., at 14:13, Debraj Manna <subharaj.manna@gmail.com> wrote:
> >
> > No I don't see the updatingEpoch file in /var/lib/zookeeper/version-2
> >
> > I started zookeeper after adding set -x to /usr/bin/zookeeper-server, and I can
> > see that zookeeper is getting started with 3.4.13 as shown below. The complete
> > logs are placed in the below gist
> >
> > https://gist.github.com/debraj-manna/509ec3d497016c4a249ee2b8dace05d9
> >
> > nohup java -Dzookeeper.datadir.autocreate=false
> > -Dzookeeper.log.dir=/var/log/zookeeper
> > -Dzookeeper.root.logger=INFO,ROLLINGFILE -cp
> > '/usr/lib/zookeeper/bin/../build/classes:/usr/lib/zookeeper/bin/../build/lib/*.jar:/usr/lib/zookeeper/bin/../lib/slf4j-log4j12.jar:/usr/lib/zookeeper/bin/../lib/slf4j-log4j12-1.7.5.jar:/usr/lib/zookeeper/bin/../lib/slf4j-api-1.7.5.jar:/usr/lib/zookeeper/bin/../lib/netty-3.10.5.Final.jar:/usr/lib/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/lib/zookeeper/bin/../lib/jline-2.11.jar:/usr/lib/zookeeper/bin/../zookeeper-3.4.13.jar:/usr/lib/zookeeper/bin/../src/java/lib/*.jar:/etc/zookeeper/conf::/etc/zookeeper/conf:/usr/lib/zookeeper/*:/usr/lib/zookeeper/lib/*'
> > -Dzookeeper.log.threshold=INFO -Dcom.sun.management.jmxremote
> > -Dcom.sun.management.jmxremote.local.only=false
> > org.apache.zookeeper.server.quorum.QuorumPeerMain
> > /etc/zookeeper/conf/zoo.cfg
> > + sleep 1
> > + echo STARTED
> > STARTED
> >
> > The content of zookeeper.log is placed in the below gist after the start
> >
> > https://gist.github.com/debraj-manna/9800c5bef32837c62bdfb324c0589ad6
> >
> > Let me know if you need any more logs.
> >
> > On Mon, Aug 26, 2019 at 9:21 PM Andor Molnar <andor@apache.org> wrote:
> >
> >> I confirmed that the fix is included in 3.4.13. That’s why I asked if you
> >> can see the ‘updatingEpoch’ file in the data folder.
> >>
> >> I don’t think the issue is unrelated, but I want to make sure that
> >> you’re running the right version by verifying the beginning of the ZK logs.
> >>
> >> Andor
> >>
> >>
> >>
> >>> On 2019. Aug 26., at 13:43, Debraj Manna <subharaj.manna@gmail.com> wrote:
> >>>
> >>> Below is the content of currentEpoch.tmp
> >>>
> >>> support@platform2:/var/lib/zookeeper/version-2$ sudo cat acceptedEpoch
> >>> 8support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch
> >>> 7support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch.tmp
> >>> 8support@platform2
> >>>
> >>> Starting zookeeper logs are rolled over as the issue was there for some
> >>> time. Will the current log with the node in this state help? Btw why do you
> >>> think this issue may not be related to zookeeper?
> >>>
> >>>
> >>>
> >>> On Mon, Aug 26, 2019 at 4:56 PM Andor Molnar <andor@apache.org> wrote:
> >>>
> >>>> Hi Debraj,
> >>>>
> >>>> The fix should be in all 3.4 versions from 3.4.6 onward, including 3.4.13.
> >>>> Can you see ‘updatingEpoch’ file in /var/lib/zookeeper/version-2?
> >>>> Also what is ‘currentEpoch.tmp’ ? I’m not sure if it relates to ZooKeeper.
> >>>>
> >>>> Would you please share full startup logs of the failing node?
> >>>>
> >>>> Regards,
> >>>> Andor
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> On 2019. Aug 23., at 18:53, Debraj Manna <subharaj.manna@gmail.com> wrote:
> >>>>>
> >>>>> Can someone answer my query below?
> >>>>>
> >>>>> I am getting confused after going through ZOOKEEPER-1653
> >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-1653> and ZOOKEEPER-2354
> >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2354>. The issues say it
> >>>>> is fixed in 3.4.6 but exists in 3.5.x. But I am seeing the issue in 3.4.13
> >>>>> also. Can someone let me know if the issue is present in 3.4.13 also?
> >>>>>
> >>>>> On Wed 21 Aug, 2019, 12:35 PM Debraj Manna, <subharaj.manna@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> With the other two zookeeper servers running I stopped the zookeeper in
> >>>>>> the broken node and then deleted all the contents inside
> >>>>>> /var/lib/zookeeper/version-2 and started the zookeeper back on the node.
> >>>>>> It is running fine now and got all the data from the other servers.
> >>>>>>
> >>>>>> I am getting confused after going through ZOOKEEPER-1653
> >>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-1653> and ZOOKEEPER-2354
> >>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2354>. The issues say
> >>>>>> it is fixed in 3.4.6 but exists in 3.5.x. But I am seeing the issue in
> >>>>>> 3.4.13 also. Can someone let me know if the issue is present in 3.4.13
> >>>>>> also?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Aug 21, 2019 at 8:54 AM Debraj Manna <subharaj.manna@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Thanks for replying.
> >>>>>>>
> >>>>>>> What is the recommended way to remove a node and delete all data from it
> >>>>>>> and make it start fresh?
> >>>>>>>
> >>>>>>> On Wed 21 Aug, 2019, 12:58 AM Enrico Olivelli, <eolivelli@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hello,
> >>>>>>>> Sorry for the late reply.
> >>>>>>>> If you have 3 servers you can nuke the broken one and make it start from
> >>>>>>>> scratch; it will join the cluster and then recover data from the other
> >>>>>>>> servers.
> >>>>>>>>
> >>>>>>>> Try it in a staging env, not in production
> >>>>>>>>
> >>>>>>>> Enrico
> >>>>>>>>
> >>>>>>>> On Tue, 20 Aug 2019, 20:30 Debraj Manna <subharaj.manna@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> The same has been asked in stackoverflow
> >>>>>>>>> <https://stackoverflow.com/questions/57574298/zookeeper-error-the-current-epoch-is-older-than-the-last-zxid>
> >>>>>>>>> also. But no response there also.
> >>>>>>>>>
> >>>>>>>>> Anyone any thoughts on this one?
> >>>>>>>>>
> >>>>>>>>> On Tue, Aug 20, 2019 at 4:43 PM Debraj Manna <subharaj.manna@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Posted wrong Jira link. I meant
> >>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2354. Can someone let me
> >>>>>>>>>> know what is the recommended way to recover the node?
> >>>>>>>>>>
> >>>>>>>>>> support@platform2:/var/lib/zookeeper/version-2$ sudo cat acceptedEpoch
> >>>>>>>>>> 8support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch
> >>>>>>>>>> 7support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch.tmp
> >>>>>>>>>> 8support@platform2
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Aug 20, 2019 at 3:14 PM Debraj Manna <subharaj.manna@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi
> >>>>>>>>>>>
> >>>>>>>>>>> I am using a zookeeper ensemble of 3 nodes running 3.4.13. Sometimes
> >>>>>>>>>>> after reboot of machine zookeeper is not starting and I am seeing the
> >>>>>>>>>>> below errors in logs.
> >>>>>>>>>>>
> >>>>>>>>>>> I have seen https://issues.apache.org/jira/browse/ZOOKEEPER-1653. Can
> >>>>>>>>>>> someone let me know if this is fixed in 3.4.13 or not as I can see the
> >>>>>>>>>>> issue still open? Also can someone suggest what is the recommended way
> >>>>>>>>>>> to recover the set-up?
> >>>>>>>>>>>
> >>>>>>>>>>> 2019-08-19 04:18:36,906 [myid:2] - ERROR [main:QuorumPeer@692] - Unable to load database on disk
> >>>>>>>>>>> java.io.IOException: The current epoch, 7, is older than the last zxid, 34359738370
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:674)
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:635)
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
> >>>>>>>>>>> 2019-08-19 04:18:36,908 [myid:2] - ERROR [main:QuorumPeerMain@92] - Unexpected exception, exiting abnormally
> >>>>>>>>>>> java.lang.RuntimeException: Unable to run quorum server
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:693)
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:635)
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
> >>>>>>>>>>> Caused by: java.io.IOException: The current epoch, 7, is older than the last zxid, 34359738370
> >>>>>>>>>>> at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:674)
> >>>>>>>>>>> ... 4 more
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>
> >>>>
> >>
> >>
>
>
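
For readers trying to interpret the error quoted in this thread: a ZooKeeper zxid is a 64-bit number whose high 32 bits carry the leader epoch and whose low 32 bits carry a counter within that epoch. The sketch below decodes the zxid from the stack trace and compares its epoch with the value in currentEpoch. The helper names split_zxid and epoch_is_stale are hypothetical (they are not ZooKeeper APIs), and the comparison is only an assumption about what the check in QuorumPeer.loadDataBase amounts to, not a copy of it.

```python
def split_zxid(zxid: int) -> tuple[int, int]:
    """Split a 64-bit zxid into (epoch, counter).

    High 32 bits: leader epoch. Low 32 bits: per-epoch transaction counter.
    """
    epoch = zxid >> 32
    counter = zxid & 0xFFFFFFFF
    return epoch, counter


def epoch_is_stale(current_epoch: int, last_zxid: int) -> bool:
    """Hypothetical helper: True when the epoch encoded in the last zxid is
    newer than the epoch stored in the currentEpoch file (assumed to be the
    condition behind the "current epoch ... is older than the last zxid" error)."""
    zxid_epoch, _ = split_zxid(last_zxid)
    return zxid_epoch > current_epoch


if __name__ == "__main__":
    # Values taken from the log and the `cat currentEpoch` output in this thread.
    print(split_zxid(34359738370))         # (8, 2)
    print(epoch_is_stale(7, 34359738370))  # True
```

With the values from the thread, the last zxid belongs to epoch 8 while currentEpoch holds 7, which matches the acceptedEpoch and currentEpoch.tmp contents (both 8) reported above.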

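The manual `cat` checks in this thread can also be gathered in one step. The following is a minimal sketch, assuming the epoch files hold a plain decimal number (as the `cat` output above suggests) and using the file names mentioned in the thread. The function name read_epoch_state is hypothetical, and the data directory is passed in by the caller rather than assumed.

```python
from pathlib import Path

# File names as mentioned in this thread.
EPOCH_FILES = ("acceptedEpoch", "currentEpoch", "currentEpoch.tmp")


def read_epoch_state(data_dir: str) -> dict:
    """Hypothetical helper: report the epoch files found in a version-2 directory.

    Returns each epoch file's integer value (None if the file is missing),
    plus whether the 'updatingEpoch' marker file exists.
    """
    base = Path(data_dir)
    state = {}
    for name in EPOCH_FILES:
        path = base / name
        # Assumption: the file contains a decimal number, possibly without a newline.
        state[name] = int(path.read_text().strip()) if path.is_file() else None
    state["updatingEpoch"] = (base / "updatingEpoch").exists()
    return state


if __name__ == "__main__":
    import sys
    # Example: python epoch_state.py /var/lib/zookeeper/version-2
    print(read_epoch_state(sys.argv[1]))
```

Reading these files may require the same privileges as the `sudo cat` commands shown in the thread.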