Date: Tue, 16 Sep 2014 14:11:40 +0530
Subject: Re: Getting errors in zookeeper logs
From: lalit jangra
To: user@zookeeper.apache.org, Flavio Junqueira

Thanks Flavio,

I will try that and report back. Can you confirm that if I add a java.env
file under the conf folder with the JVM settings "-Xms1024m -Xmx1024m", it
will limit the memory size of zookeeper to 1 GB?

Regards.

On Tue, Sep 16, 2014 at 2:05 PM, Flavio Junqueira <fpjunqueira@yahoo.com.invalid> wrote:

> What if you use 'zkServer.sh start-foreground' to debug?
>
> -Flavio
>
>
> On Tuesday, September 16, 2014 5:20 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
>
> >Hello Flavio,
> >
> >I am using the 'zkServer.sh start' command to start the zookeeper nodes.
> >I can also see logs in the log folders I have specified, but these logs
> >are in a form which is difficult to understand.
> >
> >Also, regarding the use of 6 zookeeper nodes (3+3): is it fine for
> >handling failures as per the 50% rule, i.e. if 3 are down my cluster
> >should still work, or should I move to an odd number such as 5 or 7 here?
> >
> >Regards.
> >
> >On Tue, Sep 16, 2014 at 4:26 AM, Flavio Junqueira <fpjunqueira@yahoo.com.invalid> wrote:
> >
> >> Instead of guessing, I think it is best if we understand what's going
> >> wrong with the servers; you need to look at the server logs. If you
> >> don't know how to get them, could you please share the command you're
> >> using to start servers?
> >>
> >> -Flavio
> >>
> >>
> >> On Monday, September 15, 2014 3:30 PM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
> >>
> >> >Hello Flavio,
> >> >
> >> >Can this issue arise from the system not having enough RAM for the
> >> >Java heap, as I can see my system is running at the top of its RAM?
> >> >
> >> >Also, is there any way to assign memory to zookeeper nodes?
> >> >
> >> >Regards.
> >> >
> >> >On Mon, Sep 15, 2014 at 7:37 PM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
> >> >
> >> >> Thanks Flavio,
> >> >>
> >> >> I have 3+3 zookeeper nodes on two servers, MCF1 & MCF2, and I see
> >> >> the same error on both nodes. As for the server logs, I am not able
> >> >> to read anything from them; how can I read and interpret from the
> >> >> zookeeper servers what is wrong?
> >> >>
> >> >> I have put different log & data directories for each zookeeper;
> >> >> maybe I should elaborate a bit more. I am naming the log & data
> >> >> directories as per myid (ranging from 1 to 6).
> >> >>
> >> >> ZK1 -> Data.1 -> Logs.1
> >> >> ZK2 -> Data.2 -> Logs.2
> >> >> ZK3 -> Data.3 -> Logs.3
> >> >> ZK4 -> Data.4 -> Logs.4
> >> >> ZK5 -> Data.5 -> Logs.5
> >> >> ZK6 -> Data.6 -> Logs.6
> >> >>
> >> >> As I have only two servers and need to make it run on these two
> >> >> only, I chose this architecture. I am also trying to keep it even,
> >> >> so that in the scenario where one node is down, only 3 zookeepers
> >> >> are down and the second is still working.
> >> >> If I have an odd number, say 5 or 7, and the server with the larger
> >> >> number of zookeepers goes down, it's gone.
> >> >>
> >> >> Regards.
> >> >>
> >> >>
> >> >> On Mon, Sep 15, 2014 at 7:29 PM, Flavio Junqueira <fpjunqueira@yahoo.com.invalid> wrote:
> >> >>
> >> >>> I believe you have shared just the client-side errors, and I was
> >> >>> wondering what's going on with the servers. One problem I could
> >> >>> spot with the configuration is with the values of dataDir and
> >> >>> dataLogDir. It looks like the processes on the same node are
> >> >>> writing to the same directory, which should be confusing the
> >> >>> servers.
> >> >>>
> >> >>> A couple of things about your setting. I'm not sure what your
> >> >>> motivation is to put multiple servers on the same node. It will
> >> >>> induce correlated crashes for the servers on the same node. Also,
> >> >>> in general we recommend using an odd number of servers (5 or 7 in
> >> >>> your case).
> >> >>>
> >> >>> -Flavio
> >> >>>
> >> >>> On Wednesday, September 10, 2014 6:29 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
> >> >>>
> >> >>> >Hi,
> >> >>> >
> >> >>> >I am running a cluster of two Apache ManifoldCF nodes on two
> >> >>> >separate machines, each of which has 3 zookeeper instances (6
> >> >>> >instances in total in the cluster). When I start up the ManifoldCF
> >> >>> >agents, I see the warnings below during startup.
> >> >>> >
> >> >>> >[http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
> >> >>> >org.apache.zookeeper.ClientCnxn - Unable to read additional data from
> >> >>> >server sessionid 0x0, likely server has closed socket, closing socket
> >> >>> >connection and attempting reconnect
> >> >>> >
> >> >>> >[http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
> >> >>> >org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> >> >>> >iwdc2preecma04.iwater.ie/10.231.72.25:2182.
> >> >>> >Will not attempt to authenticate using SASL (unknown error)
> >> >>> >
> >> >>> >
> >> >>> >I also see the error below in the logs while the agents are running.
> >> >>> >
> >> >>> >[localhost-startStop-1-SendThread(iwdc1preecma03.iwater.ie:2183)] WARN
> >> >>> >org.apache.zookeeper.ClientCnxn - Session 0x6485a8006060079 for server
> >> >>> >iwdc1preecma03.iwater.ie/10.231.72.24:2183, unexpected error, closing
> >> >>> >socket connection and attempting reconnect
> >> >>> >
> >> >>> >java.io.IOException: Connection reset by peer
> >> >>> >        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> >> >>> >        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> >> >>> >        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
> >> >>> >        at sun.nio.ch.IOUtil.read(IOUtil.java:193)
> >> >>> >        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
> >> >>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
> >> >>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> >> >>> >        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> >> >>> >
> >> >>> >
> >> >>> >Below are the configurations for 1. the zookeeper nodes & 2. the MCF
> >> >>> >nodes for zookeeper.
> >> >>> >
> >> >>> >
> >> >>> >*zoo.cfg : Same for all six zookeeper nodes.*
> >> >>> >
> >> >>> ># The number of milliseconds of each tick
> >> >>> >tickTime=2000
> >> >>> >dataDir=/app/IW/zookeeper/data/data.1
> >> >>> >dataLogDir=/app/IW/zookeeper/logs/log.1
> >> >>> >clientPort=2181
> >> >>> >server.1=iwdc1preecma03:2888:3888
> >> >>> >server.2=iwdc1preecma03:2889:3889
> >> >>> >server.3=iwdc1preecma03:2890:3890
> >> >>> >server.4=iwdc2preecma04:2891:3891
> >> >>> >server.5=iwdc2preecma04:2892:3892
> >> >>> >server.6=iwdc2preecma04:2893:3893
> >> >>> ># The number of ticks that the initial
> >> >>> ># synchronization phase can take
> >> >>> >initLimit=10
> >> >>> ># The number of ticks that can pass between
> >> >>> ># sending a request and getting an acknowledgement
> >> >>> >syncLimit=5
> >> >>> ># the directory where the snapshot is stored.
> >> >>> ># do not use /tmp for storage, /tmp here is just
> >> >>> ># example sakes.
> >> >>> >#dataDir=/tmp/zookeeper
> >> >>> ># the port at which the clients will connect
> >> >>> >#clientPort=2181
> >> >>> ># the maximum number of client connections.
> >> >>> ># increase this if you need to handle more clients
> >> >>> >#maxClientCnxns=60
> >> >>> >#
> >> >>> ># Be sure to read the maintenance section of the
> >> >>> ># administrator guide before turning on autopurge.
> >> >>> >#
> >> >>> ># http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> >> >>> >#
> >> >>> ># The number of snapshots to retain in dataDir
> >> >>> >autopurge.snapRetainCount=3
> >> >>> ># Purge task interval in hours
> >> >>> ># Set to "0" to disable auto purge feature
> >> >>> >autopurge.purgeInterval=1
> >> >>> >
> >> >>> >
> >> >>> >*ManifoldCF configurations : same for both ManifoldCF nodes.*
> >> >>> >
> >> >>> >value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
> >> >>> >
> >> >>> >value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
> >> >>> >
> >> >>> >value="4000"/>
> >> >>> >
> >> >>> >
> >> >>> >*I want to know whether, given the above warnings/errors, zookeeper
> >> >>> >will stop working, or whether zookeeper will keep working and these
> >> >>> >are non-fatal messages, because ManifoldCF jobs are stuck while I can
> >> >>> >see these errors.*
> >> >>> >
> >> >>> >Please suggest.
> >> >>> >
> >> >>> >Regards,
> >> >>> >Lalit.
> >> >>> >
> >> >>
> >> >> --
> >> >> Regards,
> >> >> Lalit.
> >> >>
> >> >
> >> >--
> >> >Regards,
> >> >Lalit.
> >> >
> >
> >--
> >Regards,
> >Lalit.
>

--
Regards,
Lalit.
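
On the "50% rule" question raised in this thread: as far as I understand it, ZooKeeper needs a strict majority of the configured servers to be up, not just half of them. Below is a minimal Python sketch of that arithmetic, assuming every server is a voting member. The function names are illustrative only and are not part of ZooKeeper.

```python
from typing import Sequence

# Quorum arithmetic for a ZooKeeper ensemble (illustrative sketch).
# Assumption: all servers are voting members, so a quorum is a strict
# majority of the configured ensemble size.


def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Largest number of servers that can fail while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


def survives_machine_loss(servers_per_machine: Sequence[int]) -> bool:
    """True if a quorum remains after losing the most heavily loaded machine."""
    total = sum(servers_per_machine)
    remaining = total - max(servers_per_machine)
    return remaining >= quorum_size(total)


if __name__ == "__main__":
    # Ensemble size, quorum size, and tolerated failures for 5, 6 and 7 servers.
    for n in (5, 6, 7):
        print(n, quorum_size(n), tolerated_failures(n))
    # The 3+3 layout from this thread: losing one machine leaves 3 of 6.
    print(survives_machine_loss([3, 3]))
    # A hypothetical third machine: losing any one machine leaves at least 3 of 5.
    print(survives_machine_loss([2, 2, 1]))
```

Under these assumptions, 6 servers tolerate the same 2 failures as 5 servers do, and the 3+3 layout does not keep a quorum when either machine goes down, because 3 of 6 is not a majority. With only two machines, whichever one holds at least half of the servers takes the quorum with it, so surviving a whole-machine failure would need a third machine.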