zookeeper-user mailing list archives

From Michael Han <h...@apache.org>
Subject Re: Issue migrating from Zookeeper 3.4.14 to 3.5.5
Date Mon, 29 Jul 2019 21:25:42 GMT
>> java.io.IOException: No snapshot found, but there are log entries. Something is broken!

This is expected behavior, introduced in ZOOKEEPER-2325. We don't want to
end up with a potentially inconsistent state across the ensemble when
recovering from an empty snapshot.

To continue the upgrade, just delete all txn log files and let the node sync
the snapshot from the quorum.
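
The step above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data directory /zookeeper/version-2 is taken from the quoted logs, the backup location and both function names are hypothetical, the ZooKeeper process is assumed to be stopped first, and the files are moved aside rather than deleted so they can be restored if needed.

```python
import shutil
from pathlib import Path


def summarize_data_dir(version_dir: Path) -> dict:
    """List snapshot.* and log.* files in version_dir with their sizes in bytes."""
    snapshots = {p.name: p.stat().st_size
                 for p in sorted(version_dir.glob("snapshot.*"))}
    txn_logs = {p.name: p.stat().st_size
                for p in sorted(version_dir.glob("log.*"))}
    return {"snapshots": snapshots, "txn_logs": txn_logs}


def move_txn_logs_aside(version_dir: Path, backup_dir: Path) -> list:
    """Move every log.* file from version_dir into backup_dir.

    Returns the names of the files that were moved. Snapshot files are
    left untouched.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for log_file in sorted(version_dir.glob("log.*")):
        shutil.move(str(log_file), str(backup_dir / log_file.name))
        moved.append(log_file.name)
    return moved


# Example usage (paths are assumptions; run only while the server is stopped):
#   version_dir = Path("/zookeeper/version-2")      # from the quoted logs
#   backup_dir = Path("/zookeeper/txnlog-backup")   # hypothetical location
#   print(summarize_data_dir(version_dir))
#   print(move_txn_logs_aside(version_dir, backup_dir))
```

Running the summary first shows whether any snapshot.* file exists at all, which is the condition the exception complains about.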


On Mon, Jul 29, 2019 at 1:38 PM Enrico Olivelli <eolivelli@gmail.com> wrote:

> On Mon, Jul 29, 2019, 22:32 Jörn Franke <jornfranke@gmail.com> wrote:
>
> > It also seems that 3.5.5 does not attempt to read all of the log files (I
> > still have to confirm this), but the two it does read exist, are accessible,
> > and are much larger than 0 bytes.
> >
>
> We should have the stack trace of the EOFException.
>
> Does anyone on this list have a better idea?
>
> Enrico
>
>
> > On Mon, Jul 29, 2019 at 10:13 PM Jörn Franke <jornfranke@gmail.com> wrote:
> >
> > > (Of course, I do not run them at the same time.)
> > >
> > > On Mon, Jul 29, 2019 at 10:10 PM Jörn Franke <jornfranke@gmail.com> wrote:
> > >
> > >> Thank you for the quick reply. They read from the same disk paths and
> > >> have the same access rights (in fact, the RHEL service runs them as the
> > >> same specific user).
> > >>
> > >> On Mon, Jul 29, 2019 at 10:09 PM Enrico Olivelli <eolivelli@gmail.com> wrote:
> > >>
> > >>> On Mon, Jul 29, 2019, 21:50 Jörn Franke <jornfranke@gmail.com> wrote:
> > >>>
> > >>> > Hi,
> > >>> >
> > >>> > I tried to migrate a lab environment from ZooKeeper 3.4.14 (used for Solr)
> > >>> > to 3.5.5 and encountered an issue. It is ZooKeeper in standalone mode
> > >>> > (other environments have a proper ensemble). I increased jute.maxbuffer
> > >>> > beyond the default (but not excessively); this was working perfectly fine
> > >>> > in 3.4.14.
> > >>> >
> > >>> > For the migration I basically reuse the same config files, except that I
> > >>> > whitelist some commands (later I am also interested in adding SSL).
> > >>> >
> > >>> > I get the following error message when starting ZooKeeper 3.5.5
> > >>> > (basically, I just changed the symbolic link "zookeeper" to point to the
> > >>> > 3.5.5 directory instead of the 3.4.14 directory):
> > >>> > 2019-07-29 15:16:25,217 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@655] - Created new input stream /zookeeper/version-2/log.b34
> > >>> > 2019-07-29 15:16:25,217 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@658] - Created new input archive /zookeeper/version-2/log.b34
> > >>> > 2019-07-29 15:16:25,222 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@696] - EOF exception java.io.EOFException: Failed to read /zookeeper/version-2/log.b34
> > >>> > 2019-07-29 15:16:25,223 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@655] - Created new input stream /zookeeper/version-2/log.b72
> > >>> > 2019-07-29 15:16:25,223 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@658] - Created new input archive /zookeeper/version-2/log.b72
> > >>> > 2019-07-29 15:16:25,224 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@696] - EOF exception java.io.EOFException: Failed to read /zookeeper/version-2/log.b72
> > >>> > 2019-07-29 15:16:25,224 [myid:] - ERROR [main:ZooKeeperServerMain@83] - Unexpected exception, exiting abnormally
> > >>> > java.io.IOException: No snapshot found, but there are log entries. Something is broken!
> > >>> >         at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:211)
> > >>> >         at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
> > >>> >         at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:290)
> > >>> >         at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:450)
> > >>> >         at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:764)
> > >>> >         at org.apache.zookeeper.server.ServerCnxnFactory.startup(ServerCnxnFactory.java:98)
> > >>> >         at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:144)
> > >>> >         at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:106)
> > >>> >         at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:64)
> > >>> >         at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:128)
> > >>> >         at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
> > >>> >
> > >>> > Strangely enough, if I switch back to 3.4.14 the issue is resolved and
> > >>> > ZooKeeper works normally. However, I would like to use the new version,
> > >>> > 3.5.5.
> > >>> >
> > >>> > There are no 0-byte files, and plenty of disk space is available.
> > >>> >
> > >>>
> > >>>
> > >>> Can you compare these logs with the logs of 3.4.x? Are they reading from
> > >>> the same disk paths?
> > >>>
> > >>>
> > >>>
> > >>> > Any ideas beyond erasing the data dir (I would like to avoid that; I can
> > >>> > reconstruct it, but still)? I will also try the other environments,
> > >>> > including one with an ensemble, but I would like to know beforehand
> > >>> > what the issue could be.
> > >>> >
> > >>> > Not sure if it is relevant, but:
> > >>> > Activated Kerberos Authentication and Kerberos SSL for clients and quorum.
> > >>> >
> > >>>
> > >>> Quorum? In standalone mode there is no 'quorum' auth.
> > >>>
> > >>> Enrico
> > >>>
> > >>> >
> > >>>
> > >>
> >
>
