Subject: Re: entire cluster dies with EOFException
From: Aaron Zimmerman
To: user@zookeeper.apache.org
Date: Sun, 6 Jul 2014 18:39:25 -0500

I should also mention that these servers only have 1G of RAM. ZooKeeper
usually takes around 300m.

On Sun, Jul 6, 2014 at 6:32 PM, Aaron Zimmerman wrote:

> This happened again this afternoon. I captured the log file as well as
> the preceding snapshot, but I can't upload it to jira obviously, as it's
> 10G or so. Is there an easy way to step through it and see what the
> last, presumably corrupted line is? I'll try to get ZooKeeper running in
> debug mode or something.
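>
> (One idea I plan to try if nothing better turns up; it's completely
> untested, and the install path and jar layout below are guesses for our
> setup. ZooKeeper ships an org.apache.zookeeper.server.LogFormatter class
> whose main() prints each transaction in a transaction log, so tailing
> its output should show the last record it can read before the corrupt
> one, e.g. against the log file from the earlier crash:
>
>     cd /usr/share/zookeeper
>     java -cp "./*:lib/*" org.apache.zookeeper.server.LogFormatter \
>         /var/lib/zookeeper/version-2/log.300000021 | tail -n 20
> )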
>
> I think reducing the snap count seems like a good idea, having just read
> about this setting. What has me most worried is that this issue somehow
> propagates through the entire cluster. Is there some way to gauge these
> concerns? Is there a rule of thumb like "log files should be less than
> 5G", or something else we can watch for, to head off this problem before
> it becomes a production issue?
>
> But snap count seems like a memory defense; would that cause EOF
> exceptions when reading a DataInputStream?
>
> This is the config file. It looks a bit odd since it's auto-generated by
> Exhibitor, but here it is verbatim. I was just reading about having
> separate devices for dataLogDir and dataDir. That is definitely
> something I plan to put in place.
>
> server.3=zookeeper02-n3\:2888\:3888
> server.2=zookeeper02-n2\:2888\:3888
> server.1=zookeeper02-n1\:2888\:3888
> initLimit=10
> syncLimit=5
> clientPort=2181
> tickTime=2000
> dataDir=/var/lib/zookeeper
> dataLogDir=/var/lib/zookeeper
> server.5=zookeeper02-n5\:2888\:3888
> server.4=zookeeper02-n4\:2888\:3888
>
> On Sun, Jul 6, 2014 at 5:30 PM, Flavio Junqueira
> <fpjunqueira@yahoo.com.invalid> wrote:
>
>> That's a good point; some znodes in the snapshot Aaron shared have over
>> 400k bytes, and there are some znodes with tens of kbytes too.
>>
>> Given that we are giving hints about the config parameters, it'd be
>> nice if we could see the config file.
>>
>> -Flavio
>>
>> On 06 Jul 2014, at 22:34, Benjamin Reed wrote:
>>
>>> The log files seem very large to me. Is the snap count really large,
>>> or do you have a lot of data in each znode? You might try reducing the
>>> snap count to make the log files smaller.
>>>
>>> On Jul 6, 2014 2:27 PM, "Flavio Junqueira" wrote:
>>>
>>>> But what is it that was causing problems in your scenario, Raúl? Is
>>>> it reading the log? In any case, it sounds like initLimit is the
>>>> parameter you want to change, no?
>>>>
>>>> -Flavio
>>>>
>>>> On 06 Jul 2014, at 19:09, Raúl Gutiérrez Segalés wrote:
>>>>
>>>>> Oh, Storm, right. Yeah, I've seen this. The transaction rate is so
>>>>> huge that the initial sync fails... perhaps you could try bigger
>>>>> tickTime, initLimit and syncLimit params.
>>>>>
>>>>> -rgs
>>>>>
>>>>> On 6 July 2014 04:48, Aaron Zimmerman wrote:
>>>>>
>>>>>> Raúl,
>>>>>>
>>>>>> zk_approximate_data_size 4899392
>>>>>>
>>>>>> That is about the size of the snapshots also.
>>>>>>
>>>>>> Benjamin,
>>>>>>
>>>>>> We are not running out of disk space, but the log.XXXX files are
>>>>>> quite large; is this normal? In less than 3 hours, the log file
>>>>>> since the last snapshot is 8.2G, and the older log files are as
>>>>>> large as 12G.
>>>>>>
>>>>>> We are using Storm Trident, which uses ZooKeeper pretty heavily for
>>>>>> tracking transactional state, but I'm not sure if that could
>>>>>> account for this much storage. Is there an easy way to track which
>>>>>> znodes are being updated most frequently?
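>>>>>>
>>>>>> (If there's no simpler way, I might dump a transaction log with the
>>>>>> org.apache.zookeeper.server.LogFormatter class and count how often
>>>>>> each path-like token appears. This is untested, and the classpath
>>>>>> and file name are guesses for our install:
>>>>>>
>>>>>>     java -cp "/usr/share/zookeeper/*:/usr/share/zookeeper/lib/*" \
>>>>>>         org.apache.zookeeper.server.LogFormatter \
>>>>>>         /var/lib/zookeeper/version-2/log.300000021 \
>>>>>>       | grep -o "/[A-Za-z0-9_./-]*" | sort | uniq -c | sort -rn | head
>>>>>>
>>>>>> The grep is only a rough path matcher and would need adjusting to
>>>>>> the formatter's actual output.)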
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Aaron
>>>>>>
>>>>>> On Sun, Jul 6, 2014 at 1:55 AM, Raúl Gutiérrez Segalés
>>>>>> <rgs@itevenworks.net> wrote:
>>>>>>
>>>>>>> What's the total size of the data in your ZK cluster? i.e.:
>>>>>>>
>>>>>>> $ echo mntr | nc localhost 2181 | grep zk_approximate_data_size
>>>>>>>
>>>>>>> And/or the size of the snapshot?
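>>>>>>>
>>>>>>> (Something like the following should show the snapshot sizes,
>>>>>>> assuming the stock dataDir location:
>>>>>>>
>>>>>>> $ ls -lh /var/lib/zookeeper/version-2/snapshot.*
>>>>>>> )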
>>>>>>>
>>>>>>> -rgs
>>>>>>>
>>>>>>> On 4 July 2014 06:29, Aaron Zimmerman wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> We have a 5 node zookeeper cluster that has been operating
>>>>>>>> normally for several months. Starting a few days ago, the entire
>>>>>>>> cluster crashes a few times per day, all nodes at the exact same
>>>>>>>> time. We can't track down the exact issue, but deleting the
>>>>>>>> snapshots and logs and restarting resolves it.
>>>>>>>>
>>>>>>>> We are running Exhibitor to monitor the cluster.
>>>>>>>>
>>>>>>>> It appears that something bad gets into the logs, causing an
>>>>>>>> EOFException, and this cascades through the entire cluster:
>>>>>>>>
>>>>>>>> 2014-07-04 12:55:26,328 [myid:1] - WARN
>>>>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception
>>>>>>>> when following the leader
>>>>>>>> java.io.EOFException
>>>>>>>>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>>>>>>     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>>>>>>>>     at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>>>>>>>>     at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
>>>>>>>>     at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
>>>>>>>> 2014-07-04 12:55:26,328 [myid:1] - INFO
>>>>>>>> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown
>>>>>>>> called
>>>>>>>> java.lang.Exception: shutdown Follower
>>>>>>>>     at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
>>>>>>>>
>>>>>>>> Then the server dies, Exhibitor tries to restart each node, and
>>>>>>>> they all get stuck trying to replay the bad transaction, logging
>>>>>>>> things like:
>>>>>>>>
>>>>>>>> 2014-07-04 12:58:52,734 [myid:1] - INFO [main:FileSnap@83] -
>>>>>>>> Reading snapshot /var/lib/zookeeper/version-2/snapshot.300011fc0
>>>>>>>> 2014-07-04 12:58:52,896 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>>>>>>> /var/lib/zookeeper/version-2/log.300000021
>>>>>>>> 2014-07-04 12:58:52,915 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>>>>>>> /var/lib/zookeeper/version-2/log.300000021
>>>>>>>> 2014-07-04 12:59:25,870 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>>>>>>>> java.io.EOFException: Failed to read /var/lib/zookeeper/version-2/log.300000021
>>>>>>>> 2014-07-04 12:59:25,871 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@575] - Created new input stream
>>>>>>>> /var/lib/zookeeper/version-2/log.300011fc2
>>>>>>>> 2014-07-04 12:59:25,872 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@578] - Created new input archive
>>>>>>>> /var/lib/zookeeper/version-2/log.300011fc2
>>>>>>>> 2014-07-04 12:59:48,722 [myid:1] - DEBUG
>>>>>>>> [main:FileTxnLog$FileTxnIterator@618] - EOF excepton
>>>>>>>> java.io.EOFException: Failed to read /var/lib/zookeeper/version-2/log.300011fc2
>>>>>>>>
>>>>>>>> And the cluster is dead. The only way we have found to recover is
>>>>>>>> to delete all of the data and restart.
>>>>>>>>
>>>>>>>> Anyone seen this before? Any ideas how I can track down what is
>>>>>>>> causing the EOFException, or insulate ZooKeeper from completely
>>>>>>>> crashing?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Aaron Zimmerman