zookeeper-user mailing list archives

From "FPJ" <fpjunque...@yahoo.com.INVALID>
Subject RE: entire cluster dies with EOFException
Date Mon, 14 Jul 2014 16:18:28 GMT
Thanks for reporting back, Aaron. Shall we close the jira you created?

-Flavio

> -----Original Message-----
> From: Aaron Zimmerman [mailto:azimmerman@sproutsocial.com]
> Sent: 14 July 2014 16:21
> To: user@zookeeper.apache.org
> Subject: Re: entire cluster dies with EOFException
> 
> Closing the loop on this: it appears that upping the initLimit did resolve
> the issue.  Thanks, all, for the help.
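For anyone searching the archives later: initLimit bounds how long a follower may take to connect to the leader and finish syncing, measured in ticks of tickTime, and syncLimit plays the analogous role once the follower is in sync. A minimal zoo.cfg sketch, with illustrative values rather than the ones Aaron used:

    # base time unit, in milliseconds
    tickTime=2000
    # a (re)starting follower gets initLimit * tickTime (60s here) to
    # connect to the leader and sync snapshots/txn logs before the
    # leader gives up on it
    initLimit=30
    # an in-sync follower may fall at most syncLimit * tickTime behind
    syncLimit=5

With big snapshots or a high transaction rate, the initial sync can legitimately take longer than a small allowance, which matches the failure mode discussed below.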
> 
> Thanks,
> 
> Aaron Zimmerman
> 
> 
> On Tue, Jul 8, 2014 at 4:40 PM, Flavio Junqueira <
> fpjunqueira@yahoo.com.invalid> wrote:
> 
> > Agreed, but we need that check because we expect bytes for the
> > checksum computation right underneath. The bit that's odd is that we
> > make the same check again below:
> >
> >         try {
> >                 long crcValue = ia.readLong("crcvalue");
> >                 byte[] bytes = Util.readTxnBytes(ia);
> >                 // Since we preallocate, we define EOF to be an
> >                 if (bytes == null || bytes.length==0) {
> >                     throw new EOFException("Failed to read " + logFile);
> >                 }
> >                 // EOF or corrupted record
> >                 // validate CRC
> >                 Checksum crc = makeChecksumAlgorithm();
> >                 crc.update(bytes, 0, bytes.length);
> >                 if (crcValue != crc.getValue())
> >                     throw new IOException(CRC_ERROR);
> >                 if (bytes == null || bytes.length == 0)
> >                     return false;
> >                 hdr = new TxnHeader();
> >                 record = SerializeUtils.deserializeTxn(bytes, hdr);
> >             } catch (EOFException e) {
> >
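For the record, a standalone Java sketch of that validation path (class and method names here are invented for illustration; makeChecksumAlgorithm() in FileTxnLog does return an Adler32):

    import java.io.EOFException;
    import java.io.IOException;
    import java.util.zip.Adler32;
    import java.util.zip.Checksum;

    public class TxnCrcCheck {
        static final String CRC_ERROR = "CRC check failed";

        // Mirrors the snippet: reject empty buffers first, then verify the
        // stored checksum against one recomputed over the record bytes.
        static byte[] validate(long storedCrc, byte[] bytes, String logFile)
                throws IOException {
            if (bytes == null || bytes.length == 0) {
                throw new EOFException("Failed to read " + logFile);
            }
            Checksum crc = new Adler32();  // what makeChecksumAlgorithm() returns
            crc.update(bytes, 0, bytes.length);
            if (storedCrc != crc.getValue()) {
                throw new IOException(CRC_ERROR);
            }
            // the second (bytes == null || bytes.length == 0) test in the
            // original can never be true here: the EOFException above has
            // already covered it, which is the redundancy being discussed
            return bytes;
        }

        public static void main(String[] args) throws IOException {
            byte[] payload = "some-serialized-txn".getBytes();
            Checksum crc = new Adler32();
            crc.update(payload, 0, payload.length);
            System.out.println(validate(crc.getValue(), payload,
                    "log.300000021").length + " bytes validated");
        }
    }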
> > I'm moving this discussion to the jira, btw.
> >
> > -Flavio
> >
> > On 07 Jul 2014, at 22:03, Aaron Zimmerman <azimmerman@sproutsocial.com> wrote:
> >
> > > Flavio,
> > >
> > > Yes, that is the initial error; the nodes in the cluster are then
> > > restarted but fail to come back up, with:
> > >
> > > 2014-07-04 12:58:52,734 [myid:1] - INFO  [main:FileSnap@83] - Reading snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
> > > 2014-07-04 12:58:52,896 [myid:1] - DEBUG [main:FileTxnLog$FileTxnIterator@575] - Created new input stream /var/lib/zookeeper/version-2/log.300000021
> > > 2014-07-04 12:58:52,915 [myid:1] - DEBUG [main:FileTxnLog$FileTxnIterator@578] - Created new input archive /var/lib/zookeeper/version-2/log.300000021
> > > 2014-07-04 12:59:25,870 [myid:1] - DEBUG [main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException: Failed to read /var/lib/zookeeper/version-2/log.300000021
> > > 2014-07-04 12:59:25,871 [myid:1] - DEBUG [main:FileTxnLog$FileTxnIterator@575] - Created new input stream /var/lib/zookeeper/version-2/log.300011fc2
> > > 2014-07-04 12:59:25,872 [myid:1] - DEBUG [main:FileTxnLog$FileTxnIterator@578] - Created new input archive /var/lib/zookeeper/version-2/log.300011fc2
> > > 2014-07-04 12:59:48,722 [myid:1] - DEBUG [main:FileTxnLog$FileTxnIterator@618] - EOF excepton java.io.EOFException: Failed to read /var/lib/zookeeper/version-2/log.300011fc2
> > >
> > > Thanks,
> > >
> > > AZ
> > >
> > >
> > > On Mon, Jul 7, 2014 at 3:33 PM, Flavio Junqueira <
> > > fpjunqueira@yahoo.com.invalid> wrote:
> > >
> > >> I'm a bit confused; the stack trace you reported was this one:
> > >>
> > >> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
> > >> when following the leader
> > >> java.io.EOFException
> > >>       at java.io.DataInputStream.readInt(DataInputStream.java:375)
> > >>       at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> > >>       at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> > >>       at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> > >>       at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> > >>       at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
> > >>       at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
> > >>
> > >>
> > >> That's in a different part of the code.
> > >>
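The distinction matters: that trace is the follower's network read from the leader, not the txn log read. A tiny sketch (not ZooKeeper code) of why it bottoms out in DataInputStream.readInt: when the peer closes the connection mid-record, the next 4-byte read hits end-of-stream, and readInt raises the same EOFException class the log iterator uses for its own purposes:

    import java.io.ByteArrayInputStream;
    import java.io.DataInputStream;
    import java.io.EOFException;

    public class ReadIntEof {
        public static void main(String[] args) throws Exception {
            // simulate a connection that was dropped after two bytes
            DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(new byte[] {0x00, 0x01}));
            try {
                in.readInt(); // needs 4 bytes, only 2 are available
            } catch (EOFException e) {
                System.out.println("EOF, as in Learner.readPacket: " + e);
            }
        }
    }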
> > >> -Flavio
> > >>
> > >> On 07 Jul 2014, at 18:50, Aaron Zimmerman <azimmerman@sproutsocial.com> wrote:
> > >>
> > >>> Util.readTxnBytes reads from the buffer and, if the length is 0, it
> > >>> returns the zero-length array, seemingly indicating the end of the file.
> > >>>
> > >>> Then this is detected in FileTxnLog.java:671:
> > >>>
> > >>>               byte[] bytes = Util.readTxnBytes(ia);
> > >>>               // Since we preallocate, we define EOF to be an
> > >>>               if (bytes == null || bytes.length==0) {
> > >>>                   throw new EOFException("Failed to read " + logFile);
> > >>>               }
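Roughly what is going on (a paraphrase, not the real Util.readTxnBytes; LogicalEof is an invented name): txn log files are preallocated and zero-filled ahead of the write pointer, so a length-prefixed read coming back empty is taken to mean the logical end of the file rather than corruption:

    import java.io.ByteArrayInputStream;
    import java.io.DataInputStream;
    import java.io.IOException;

    public class LogicalEof {
        static byte[] readTxnBytes(DataInputStream in) throws IOException {
            int len = in.readInt();       // jute buffers carry a length prefix
            byte[] bytes = new byte[len];
            in.readFully(bytes);
            return bytes;                 // len == 0 -> zero-filled region
        }

        public static void main(String[] args) throws IOException {
            // sixteen zero bytes stand in for untouched preallocated space
            DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(new byte[16]));
            if (readTxnBytes(in).length == 0) {
                // FileTxnLog reacts by throwing EOFException("Failed to
                // read " + logFile), then catches it and moves on; on a
                // clean tail the exception is expected, not an error
                System.out.println("logical end of preallocated log");
            }
        }
    }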
> > >>>
> > >>>
> > >>> This exception is caught a few lines later, and the streams are
> > >>> closed, etc.
> > >>>
> > >>> So this seems to be not really an error condition, but a signal that
> > >>> the entire file has been read? Is this exception a red herring?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Mon, Jul 7, 2014 at 11:50 AM, Raúl Gutiérrez Segalés
> > >>> <rgs@itevenworks.net> wrote:
> > >>>
> > >>>> On 7 July 2014 09:39, Aaron Zimmerman <azimmerman@sproutsocial.com> wrote:
> > >>>>
> > >>>>> What I don't understand is how the entire cluster could die in
> > >>>>> such a situation.  I was able to load zookeeper locally using the
> > >>>>> snapshot and 10g log file without apparent issue.
> > >>>>
> > >>>>
> > >>>> Sure, but it's syncing up with other learners that becomes
> > >>>> challenging when having either big snapshots or too many txnlogs,
> > >>>> right?
> > >>>>
> > >>>>
> > >>>>> I can see how large amounts of data could cause latency issues in
> > >>>>> syncing, causing a single worker to die, but how would that explain
> > >>>>> the node's inability to restart?  When the server replays the log
> > >>>>> file, does it have to sync the transactions to other nodes while it
> > >>>>> does so?
> > >>>>>
> > >>>>
> > >>>> Given that your txn churn is so big, by the time it finishes reading
> > >>>> from disk it'll need to catch up with the quorum... how many txns
> > >>>> have happened by that point? By the way, we use this patch:
> > >>>>
> > >>>> https://issues.apache.org/jira/browse/ZOOKEEPER-1804
> > >>>>
> > >>>> to measure transaction rate; do you have any approximation of what
> > >>>> your transaction rate might be?
> > >>>>
> > >>>>
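If applying that patch isn't convenient, the rate can be approximated from outside: the low 32 bits of the zxid count transactions within the current leader epoch, and the 'stat' four-letter command reports the current zxid. A sketch, where the class name, host, port, and the 5-second window are all assumptions (and a leader election between samples would skew the result):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.Socket;

    public class TxnRate {
        static long readZxid(String host, int port) throws Exception {
            try (Socket s = new Socket(host, port)) {
                OutputStream out = s.getOutputStream();
                out.write("stat".getBytes("US-ASCII"));
                out.flush();
                BufferedReader r = new BufferedReader(
                        new InputStreamReader(s.getInputStream(), "US-ASCII"));
                String line;
                while ((line = r.readLine()) != null) {
                    if (line.startsWith("Zxid:")) {
                        // e.g. "Zxid: 0x300011fc2"
                        return Long.decode(line.split("\\s+")[1]);
                    }
                }
            }
            throw new IllegalStateException("no Zxid line in stat output");
        }

        public static void main(String[] args) throws Exception {
            long a = readZxid("localhost", 2181);
            Thread.sleep(5000);
            long b = readZxid("localhost", 2181);
            // mask to the low 32 bits: the high 32 are the leader epoch
            long txns = (b & 0xffffffffL) - (a & 0xffffffffL);
            System.out.printf("~%.1f txns/sec%n", txns / 5.0);
        }
    }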
> > >>>>>
> > >>>>> I can alter the settings as has been discussed, but I worry that
> > >>>>> I'm just delaying the same thing from happening again if I deploy
> > >>>>> another storm topology or something.  How can I get the cluster in a
> > >>>>> state where I can be confident that it won't crash in a similar way
> > >>>>> as load increases, or at least set up some kind of monitoring that
> > >>>>> will let me know something is unhealthy?
> > >>>>>
> > >>>>
> > >>>> I think it depends on what your txn rate is; let's measure that
> > >>>> first, I guess.
> > >>>>
> > >>>>
> > >>>> -rgs
> > >>>>
> > >>
> > >>
> >
> >

