zookeeper-user mailing list archives

From lalit jangra <lalit.j.jan...@gmail.com>
Subject Re: Getting errors in zookeeper logs
Date Tue, 16 Sep 2014 04:20:03 GMT
Hello Flavio,

I am using the 'zkServer.sh start' command to start the ZooKeeper nodes. I
can also see logs in the log folders I have specified, but these logs are in
a form that is difficult to understand.

Also, regarding the use of 6 ZooKeeper nodes (3+3): is this fine for handling
failures under the 50% rule, i.e. should my cluster keep working if 3 nodes
are down, or should I move to an odd number such as 5 or 7?
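
A minimal sketch of the arithmetic behind this question, assuming ZooKeeper's
default majority quorum (more than half of the ensemble must be up). The
function names and the per-host layouts are illustrative only and are not
part of ZooKeeper:

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def survives_host_loss(servers_per_host: list[int], lost_host: int) -> bool:
    """True if the servers left after losing one host still form a quorum."""
    total = sum(servers_per_host)
    remaining = total - servers_per_host[lost_host]
    return remaining >= quorum_size(total)


# Hypothetical layouts: number of ZooKeeper servers on each of two hosts.
for layout in ([3, 3], [3, 2], [4, 3]):
    results = [survives_host_loss(layout, h) for h in range(len(layout))]
    print(layout, "quorum =", quorum_size(sum(layout)),
          "survives loss of host:", results)
```

Under this assumption a 6-node ensemble needs 4 servers up, so it tolerates
only 2 failures, the same as a 5-node ensemble, and a 3+3 layout does not
survive the loss of either host.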

Regards.

On Tue, Sep 16, 2014 at 4:26 AM, Flavio Junqueira <
fpjunqueira@yahoo.com.invalid> wrote:

> Instead of guessing, I think it is best if we understand what's going
> wrong with the servers; for that, you need to look at the server logs. If
> you don't know how to get them, could you please share the command you're
> using to start the servers?
>
> -Flavio
>
>
>
> On Monday, September 15, 2014 3:30 PM, lalit jangra <
> lalit.j.jangra@gmail.com> wrote:
>
>
> >
> >
> >Hello Flavio,
> >
> >Can this issue arise from the system not having enough RAM for the Java
> >heap? I can see that my system is running at the top of its RAM.
> >
> >Also, is there any way to assign memory to ZooKeeper nodes?
> >
> >Regards.
> >
> >On Mon, Sep 15, 2014 at 7:37 PM, lalit jangra <lalit.j.jangra@gmail.com>
> >wrote:
> >
> >> Thanks Flavio,
> >>
> >> I have 3+3 ZooKeeper nodes on two servers, MCF1 & MCF2, and I see the
> >> same error on both. As for the server logs, I am not able to read
> >> anything from them. How can I read and interpret from the ZooKeeper
> >> servers what is wrong?
> >>
> >> I have set different log & data directories for each ZooKeeper instance;
> >> maybe I should elaborate a bit more. I name the log & data directories
> >> after myid (ranging from 1 to 6).
> >>
> >> ZK1 -> Data.1 -> Logs.1
> >> ZK2 -> Data.2 -> Logs.2
> >> ZK3 -> Data.3 -> Logs.3
> >> ZK4 -> Data.4 -> Logs.4
> >> ZK5 -> Data.5 -> Logs.5
> >> ZK6 -> Data.6 -> Logs.6
> >>
> >> As I have only two servers and need to run on these two only, I chose
> >> this architecture. I am also trying to keep the split even, so that if
> >> one server is down, only 3 ZooKeeper instances are down and the second
> >> server is still working. If I have an odd number, say 5 or 7, and the
> >> server with the larger number of ZooKeeper instances goes down, the
> >> cluster is gone.
> >>
> >> Regards.
> >>
> >>
> >> On Mon, Sep 15, 2014 at 7:29 PM, Flavio Junqueira <
> >> fpjunqueira@yahoo.com.invalid> wrote:
> >>
> >>> I believe you have shared just the client-side errors, and I was
> >>> wondering what's going on with the servers. One problem I could spot
> >>> with the configuration is with the values of dataDir and dataLogDir. It
> >>> looks like the processes on the same node are writing to the same
> >>> directory, which should be confusing the servers.
> >>>
> >>> A couple of things about your setup. I'm not sure what your motivation
> >>> is for putting multiple servers on the same node. It will induce
> >>> correlated crashes for the servers on the same node. Also, we generally
> >>> recommend using an odd number of servers (5 or 7 in your case).
> >>>
> >>> -Flavio
> >>>
> >>> On Wednesday, September 10, 2014 6:29 AM, lalit jangra <
> >>> lalit.j.jangra@gmail.com> wrote:
> >>>
> >>>
> >>> >
> >>> >
> >>> >Hi,
> >>> >
> >>> >I am running a cluster of two Apache ManifoldCF nodes on two separate
> >>> >machines, each of which has 3 ZooKeeper instances (6 instances in
> >>> >total in the cluster). When I start up the ManifoldCF agents, I see the
> >>> >warning below during startup.
> >>> >
> >>> >[http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
> >>> >org.apache.zookeeper.ClientCnxn - Unable to read additional data from
> >>> >server sessionid 0x0, likely server has closed socket, closing socket
> >>> >connection and attempting reconnect
> >>> >
> >>> >[http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
> >>> >org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> >>> >iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
> >>> >authenticate using SASL (unknown error)
> >>> >
> >>> >
> >>> >Also, I can see the error below in the logs while the agents are running.
> >>> >
> >>> >[localhost-startStop-1-SendThread(iwdc1preecma03.iwater.ie:2183)] WARN
> >>> >org.apache.zookeeper.ClientCnxn - Session 0x6485a8006060079 for server
> >>> >iwdc1preecma03.iwater.ie/10.231.72.24:2183, unexpected error, closing
> >>> >socket connection and attempting reconnect
> >>> >
> >>> >java.io.IOException: Connection reset by peer
> >>> >        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> >>> >        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> >>> >        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
> >>> >        at sun.nio.ch.IOUtil.read(IOUtil.java:193)
> >>> >        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
> >>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
> >>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> >>> >        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> >>> >
> >>> >
> >>> >Below are the configurations for (1) the ZooKeeper nodes and (2) the
> >>> >ZooKeeper settings on the MCF nodes.
> >>> >
> >>> >
> >>> >*zoo.cfg: same for all six ZooKeeper nodes.*
> >>> >
> >>> >
> >>> ># The number of milliseconds of each tick
> >>> >tickTime=2000
> >>> >dataDir=/app/IW/zookeeper/data/data.1
> >>> >dataLogDir=/app/IW/zookeeper/logs/log.1
> >>> >clientPort=2181
> >>> >server.1=iwdc1preecma03:2888:3888
> >>> >server.2=iwdc1preecma03:2889:3889
> >>> >server.3=iwdc1preecma03:2890:3890
> >>> >server.4=iwdc2preecma04:2891:3891
> >>> >server.5=iwdc2preecma04:2892:3892
> >>> >server.6=iwdc2preecma04:2893:3893
> >>> ># The number of ticks that the initial
> >>> ># synchronization phase can take
> >>> >initLimit=10
> >>> ># The number of ticks that can pass between
> >>> ># sending a request and getting an acknowledgement
> >>> >syncLimit=5
> >>> ># the directory where the snapshot is stored.
> >>> ># do not use /tmp for storage, /tmp here is just
> >>> ># example sakes.
> >>> >#dataDir=/tmp/zookeeper
> >>> ># the port at which the clients will connect
> >>> >#clientPort=2181
> >>> ># the maximum number of client connections.
> >>> ># increase this if you need to handle more clients
> >>> >#maxClientCnxns=60
> >>> >#
> >>> ># Be sure to read the maintenance section of the
> >>> ># administrator guide before turning on autopurge.
> >>> >#
> >>> ># http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> >>> >#
> >>> ># The number of snapshots to retain in dataDir
> >>> >autopurge.snapRetainCount=3
> >>> ># Purge task interval in hours
> >>> ># Set to "0" to disable auto purge feature
> >>> >autopurge.purgeInterval=1
> >>> >
> >>> >
> >>> >
> >>> >*ManifoldCF configurations: same for both ManifoldCF nodes.*
> >>> >
> >>> >
> >>> ><property name="org.apache.manifoldcf.lockmanagerclass"
> >>> >value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
> >>> >
> >>> ><property name="org.apache.manifoldcf.zookeeper.connectstring"
> >>> >value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
> >>> >
> >>> ><property name="org.apache.manifoldcf.zookeeper.sessiontimeout"
> >>> >value="4000"/>
> >>> >
> >>> >
> >>> >
> >>> >*I want to know whether, given the above warnings/errors, ZooKeeper
> >>> >will stop working, or whether it will keep working and these are
> >>> >non-fatal messages, because ManifoldCF jobs are stuck while I see
> >>> >these errors.*
> >>> >
> >>> >Please suggest.
> >>> >
> >>> >Regards,
> >>> >Lalit.
> >
> >>> >
> >>> >
> >>> >
> >>
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Lalit.
> >>
> >
> >
> >
> >--
> >Regards,
> >Lalit.
> >
> >
> >
>



-- 
Regards,
Lalit.
