zookeeper-user mailing list archives

From lalit jangra <lalit.j.jan...@gmail.com>
Subject Re: Getting errors in zookeeper logs
Date Tue, 16 Sep 2014 08:41:40 GMT
Thanks Flavio,

I will try that and report back. Can you confirm that if I add a java.env
file under the conf folder with the JVM settings "-Xms 1024m -Xmx1024m", it
will limit ZooKeeper's memory to 1 GB?
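
For reference, a minimal sketch of what such a conf/java.env could look like.
It assumes the stock scripts behave as in the 3.4.x line, where zkEnv.sh
sources conf/java.env if the file exists and zkServer.sh passes JVMFLAGS to
the JVM; please check this against your own copy of the scripts.

```shell
# conf/java.env (sketch; assumed to be sourced by zkEnv.sh when present)
# Note: the flag and its value are written without a space. As far as I know,
# "-Xms 1024m" (with a space) is not accepted by the JVM.
export JVMFLAGS="-Xms1024m -Xmx1024m"
```

Note that -Xmx caps only the Java heap, so the total memory footprint of the
process can end up somewhat above 1 GB.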

Regards.
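
For the record, the per-instance layout discussed further down the thread
(data and log directories named after myid, client ports 2181 to 2183 on each
machine as in the ManifoldCF connect string) can be sketched as below. The
helper name instance_settings and the port rule (2180 plus the local index)
are only illustrative assumptions; the base path is the one from the quoted
zoo.cfg.

```shell
#!/bin/sh
# Hypothetical helper: print the zoo.cfg settings that must differ per instance.
# BASE is taken from the zoo.cfg quoted in this thread.
BASE=/app/IW/zookeeper

instance_settings() {
    myid=$1                                # 1..6, same value as in the myid file
    local_index=$(( (myid - 1) % 3 + 1 ))  # 1..3 on each of the two machines
    echo "dataDir=$BASE/data/data.$myid"
    echo "dataLogDir=$BASE/logs/log.$myid"
    # Assumed port rule, inferred from the connect string: 2181, 2182, 2183
    echo "clientPort=$(( 2180 + local_index ))"
}

for id in 1 2 3 4 5 6; do
    echo "# zoo.cfg overrides for server.$id"
    instance_settings "$id"
done
```

The point is only that no two instances on the same machine share a dataDir,
a dataLogDir, or a clientPort; the server.N lines stay identical everywhere.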

On Tue, Sep 16, 2014 at 2:05 PM, Flavio Junqueira <
fpjunqueira@yahoo.com.invalid> wrote:

> What if you use 'zkServer.sh start-foreground' to debug?
>
> -Flavio
>
>
> On Tuesday, September 16, 2014 5:20 AM, lalit jangra <
> lalit.j.jangra@gmail.com> wrote:
>
>
> >
> >
> >Hello Flavio,
> >
> >I am using the 'zkServer.sh start' command to start the zookeeper nodes. I
> >can also see logs in the log folders I have specified, but these logs are
> >in a form which is difficult to understand.
> >
> >Also, regarding the use of 6 zookeeper nodes (3+3): is that fine for
> >handling failures under the 50% rule, i.e. if 3 are down my cluster should
> >still work, or should I move to an odd number such as 5 or 7 here?
> >
> >Regards.
> >
> >On Tue, Sep 16, 2014 at 4:26 AM, Flavio Junqueira <
> >fpjunqueira@yahoo.com.invalid> wrote:
> >
> >> Instead of guessing, I think it is best if we understand what's going
> >> wrong with the servers, you need to look at the server logs. If you don't
> >> know how to get it, could you please share the command you're using to
> >> start servers?
> >>
> >> -Flavio
> >>
> >>
> >>
> >> On Monday, September 15, 2014 3:30 PM, lalit jangra <
> >> lalit.j.jangra@gmail.com> wrote:
> >>
> >>
> >> >
> >> >
> >> >Hello Flavio,
> >> >
> >> >Can this issue arise from system not having enough RAM for Java Heap as i
> >> >could see  my system is running on top of its RAM?
> >> >
> >> >Also is there any way to assign memory to zookeeper nodes?
> >> >
> >> >Regards.
> >> >
> >> >On Mon, Sep 15, 2014 at 7:37 PM, lalit jangra <lalit.j.jangra@gmail.com>
> >> >wrote:
> >> >
> >> >> Thanks Flavio,
> >> >>
> >> >> I am having 3+3 zookeeper nodes on two servers MCF1 & MCF2. Also i could
> >> >> see same error on both nodes. For logs into servers, i am not able to read
> >> >> anything from these, how can i read and interpret from zookeeper servers
> >> >> what is wrong?
> >> >>
> >> >> I have put different log & data directories for each of zookeeper, may be
> >> >> i should elaborate a bit more. I am deciding on names of logs & data
> >> >> directory as per myid (ranging from 1 to 6).
> >> >>
> >> >> ZK1 -> Data.1 -> Logs.1
> >> >> ZK2 -> Data.2 -> Logs.2
> >> >> ZK3 -> Data.3 -> Logs.3
> >> >> ZK4 -> Data.4 -> Logs.4
> >> >> ZK5 -> Data.5 -> Logs.5
> >> >> ZK6 -> Data.6 -> Logs.6
> >> >>
> >> >> As i have two servers only and i need to make it running on these two only
> >> >> so i chose this architecture. Also i am trying to make even for scenario
> >> >> where one node is down, i have only 3 zookeepers down so still second is
> >> >> working. If i have odd numbers say 5 or 7, if server with more numbers of
> >> >> zookeeper is down, its gone.
> >> >>
> >> >> Regards.
> >> >>
> >> >>
> >> >> On Mon, Sep 15, 2014 at 7:29 PM, Flavio Junqueira <
> >> >> fpjunqueira@yahoo.com.invalid> wrote:
> >> >>
> >> >>> I believe you have shared just the client-side errors, and I was
> >> >>> wondering what's going on with the servers. One problem I could spot with
> >> >>> the configuration is with the values of dataDir and dataLogDir. It looks
> >> >>> like the processes on the same node are writing to the same directory,
> >> >>> which should be confusing the servers.
> >> >>>
> >> >>> A couple of things about your setting. I'm not sure what your motivation
> >> >>> is to put multiple servers on the same node. It will induce correlated
> >> >>> crashes for the servers on the same node. Also, we in general recommend to
> >> >>> use an odd number of servers (5 or 7 for your case).
> >> >>>
> >> >>> -Flavio
> >> >>>
> >> >>> On Wednesday, September 10, 2014 6:29 AM, lalit jangra <
> >> >>> lalit.j.jangra@gmail.com> wrote:
> >> >>>
> >> >>>
> >> >>> >
> >> >>> >
> >> >>> >Hi,
> >> >>> >
> >> >>> >I am running cluster of two Apache ManifoldCF nodes on two separate
> >> >>> >machines each of which having 3 zookeeper instances (total 6 instances in
> >> >>> >cluster). When i am running up manifoldCF agents, i see below warning
> >> >>> >during startup.
> >> >>> >
> >> >>> >[http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
> >> >>> >org.apache.zookeeper.ClientCnxn - Unable to read additional data from
> >> >>> >server sessionid 0x0, likely server has closed socket, closing socket
> >> >>> >connection and attempting reconnect
> >> >>> >
> >> >>> >[http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
> >> >>> >org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> >> >>> >iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
> >> >>> >authenticate using SASL (unknown error)
> >> >>> >
> >> >>> >
> >> >>> >Also i could see below error in logs in while agents are running.
> >> >>> >
> >> >>> >[localhost-startStop-1-SendThread(iwdc1preecma03.iwater.ie:2183)] WARN
> >> >>> >org.apache.zookeeper.ClientCnxn - Session 0x6485a8006060079 for server
> >> >>> >iwdc1preecma03.iwater.ie/10.231.72.24:2183, unexpected error, closing
> >> >>> >socket connection and attempting reconnect
> >> >>> >
> >> >>> >java.io.IOException: Connection reset by peer
> >> >>> >        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> >> >>> >        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> >> >>> >        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
> >> >>> >        at sun.nio.ch.IOUtil.read(IOUtil.java:193)
> >> >>> >        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
> >> >>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
> >> >>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> >> >>> >        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> >> >>> >
> >> >>> >
> >> >>> >Below are configurations for 1. zookeeper nodes & 2. MCF nodes for
> >> >>> >zookeeper.
> >> >>> >
> >> >>> >
> >> >>> >*zoo.cfg :  Same for all six zookeeper nodes.*
> >> >>> >
> >> >>> >
> >> >>> ># The number of milliseconds of each tick
> >> >>> >tickTime=2000
> >> >>> >dataDir=/app/IW/zookeeper/data/data.1
> >> >>> >dataLogDir=/app/IW/zookeeper/logs/log.1
> >> >>> >clientPort=2181
> >> >>> >server.1=iwdc1preecma03:2888:3888
> >> >>> >server.2=iwdc1preecma03:2889:3889
> >> >>> >server.3=iwdc1preecma03:2890:3890
> >> >>> >server.4=iwdc2preecma04:2891:3891
> >> >>> >server.5=iwdc2preecma04:2892:3892
> >> >>> >server.6=iwdc2preecma04:2893:3893
> >> >>> ># The number of ticks that the initial
> >> >>> ># synchronization phase can take
> >> >>> >initLimit=10
> >> >>> ># The number of ticks that can pass between
> >> >>> ># sending a request and getting an acknowledgement
> >> >>> >syncLimit=5
> >> >>> ># the directory where the snapshot is stored.
> >> >>> ># do not use /tmp for storage, /tmp here is just
> >> >>> ># example sakes.
> >> >>> >#dataDir=/tmp/zookeeper
> >> >>> ># the port at which the clients will connect
> >> >>> >#clientPort=2181
> >> >>> ># the maximum number of client connections.
> >> >>> ># increase this if you need to handle more clients
> >> >>> >#maxClientCnxns=60
> >> >>> >#
> >> >>> ># Be sure to read the maintenance section of the
> >> >>> ># administrator guide before turning on autopurge.
> >> >>> >#
> >> >>> ># http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> >> >>> >#
> >> >>> ># The number of snapshots to retain in dataDir
> >> >>> >autopurge.snapRetainCount=3
> >> >>> ># Purge task interval in hours
> >> >>> ># Set to "0" to disable auto purge feature
> >> >>> >autopurge.purgeInterval=1
> >> >>> >
> >> >>> >
> >> >>> >
> >> >>> >*ManifoldCF configurations : same for both ManifoldCF nodes.*
> >> >>> >
> >> >>> >
> >> >>> ><property name="org.apache.manifoldcf.lockmanagerclass"
> >> >>> >  value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
> >> >>> >
> >> >>> ><property name="org.apache.manifoldcf.zookeeper.connectstring"
> >> >>> >  value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
> >> >>> >
> >> >>> ><property name="org.apache.manifoldcf.zookeeper.sessiontimeout"
> >> >>> >  value="4000"/>
> >> >>> >
> >> >>> >
> >> >>> >
> >> >>> >*I want to know if due to above warnings/errors, will zookeeper stop
> >> >>> >working or will zookeeper will work and these are non-failing messages,
> >> >>> >because ManifoldCF jobs are stuck while i can see these errors.*
> >> >>> >
> >> >>> >Please suggest.
> >> >>> >
> >> >>> >Regards,
> >> >>> >Lalit.
> >
> >> >
> >> >>> >
> >> >>> >
> >> >>> >
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Regards,
> >> >> Lalit.
> >> >>
> >> >
> >> >
> >> >
> >> >--
> >> >Regards,
> >> >Lalit.
> >> >
> >> >
> >> >
> >>
> >
> >
> >
> >--
> >Regards,
> >Lalit.
> >
> >
> >
>
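
On the 3+3 question quoted above, a quick sketch of the majority arithmetic
may help. It assumes the default majority quorum, where strictly more than
half of the voting servers must be up; the function names are made up for
illustration.

```shell
#!/bin/sh
# Majority quorum: strictly more than half of the ensemble must be running.
quorum_size() { echo $(( $1 / 2 + 1 )); }

# Number of servers that can fail while a majority is still available.
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 3 5 6 7; do
    echo "ensemble=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 6 servers the quorum is 4, so losing either machine (3 servers) leaves
only 3 and the ensemble stops serving. An ensemble of 6 tolerates the same 2
failures as an ensemble of 5, which is why odd sizes are usually recommended.
With only two machines, whichever one hosts the majority remains a single
point of failure regardless of the count.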



-- 
Regards,
Lalit.
