zookeeper-user mailing list archives

From Flavio Junqueira <fpjunque...@yahoo.com.INVALID>
Subject Re: Getting errors in zookeeper logs
Date Tue, 16 Sep 2014 08:53:20 GMT
It seems fine to limit the amount of memory, but I'd rather get the error that is preventing
the servers from starting/making progress first.

-Flavio
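
On the 3+3 question raised further down the thread, the arithmetic behind the odd-number recommendation can be sketched as below. This is only an illustration, assuming ZooKeeper's default majority quorum (strictly more than half of the voting servers must be up). The function names quorum_size and tolerated_failures are made up for this example and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble // 2 + 1


def tolerated_failures(ensemble: int) -> int:
    """Number of servers that can be down while a majority is still up."""
    return ensemble - quorum_size(ensemble)


if __name__ == "__main__":
    # Compare the ensemble sizes discussed in the thread.
    for n in (3, 5, 6, 7):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} down")
```

Under that assumption, 6 servers need 4 for a quorum, so a 3+3 split loses the quorum when either machine goes down, and 6 servers tolerate no more failures than 5.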


On Tuesday, September 16, 2014 9:48 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
 

>
>
>Thanks Flavio,
>
>I will try that and update. Can you confirm that if I add java.env under the
>conf folder with the JVM settings "-Xms1024m -Xmx1024m", it will limit the
>memory size of ZooKeeper to 1 GB?
>
>Regards.
>
>On Tue, Sep 16, 2014 at 2:05 PM, Flavio Junqueira <
>fpjunqueira@yahoo.com.invalid> wrote:
>
>> What if you use 'zkServer.sh start-foreground' to debug?
>>
>> -Flavio
>>
>>
>> On Tuesday, September 16, 2014 5:20 AM, lalit jangra <
>> lalit.j.jangra@gmail.com> wrote:
>>
>>
>> >
>> >
>> >Hello Flavio,
>> >
>> >I am using the 'zkServer.sh start' command to start the ZooKeeper nodes. I
>> >can also see logs in the log folders I have specified, but these logs are
>> >in a form that is difficult to understand.
>> >
>> >Also, regarding the use of 6 ZooKeeper nodes (3+3): is that fine for handling
>> >failures under the 50% rule, so that my cluster still works if 3 are down, or
>> >should I move to an odd number such as 5 or 7 here?
>> >
>> >Regards.
>> >
>> >On Tue, Sep 16, 2014 at 4:26 AM, Flavio Junqueira <
>> >fpjunqueira@yahoo.com.invalid> wrote:
>> >
>> >> Instead of guessing, I think it is best that we understand what's going
>> >> wrong with the servers; you need to look at the server logs. If you don't
>> >> know how to get them, could you please share the command you're using to
>> >> start the servers?
>> >>
>> >> -Flavio
>> >>
>> >>
>> >>
>> >> On Monday, September 15, 2014 3:30 PM, lalit jangra <
>> >> lalit.j.jangra@gmail.com> wrote:
>> >>
>> >>
>> >> >
>> >> >
>> >> >Hello Flavio,
>> >> >
>> >> >Can this issue arise from the system not having enough RAM for the Java
>> >> >heap? I can see that my system is running at the top of its RAM.
>> >> >
>> >> >Also, is there any way to assign memory to ZooKeeper nodes?
>> >> >
>> >> >Regards.
>> >> >
>> >> >On Mon, Sep 15, 2014 at 7:37 PM, lalit jangra <lalit.j.jangra@gmail.com>
>> >> >wrote:
>> >> >
>> >> >> Thanks Flavio,
>> >> >>
>> >> >> I have 3+3 ZooKeeper nodes on two servers, MCF1 & MCF2. I can also
>> >> >> see the same error on both nodes. As for the server logs, I am not able
>> >> >> to read anything from them; how can I read and interpret from the
>> >> >> ZooKeeper servers what is wrong?
>> >> >>
>> >> >> I have set different log & data directories for each ZooKeeper
>> >> >> instance; maybe I should elaborate a bit more. I name the log & data
>> >> >> directories according to myid (ranging from 1 to 6).
>> >> >>
>> >> >> ZK1 -> Data.1 -> Logs.1
>> >> >> ZK2 -> Data.2 -> Logs.2
>> >> >> ZK3 -> Data.3 -> Logs.3
>> >> >> ZK4 -> Data.4 -> Logs.4
>> >> >> ZK5 -> Data.5 -> Logs.5
>> >> >> ZK6 -> Data.6 -> Logs.6
>> >> >>
>> >> >> As I have only two servers and need to make it run on these two, I
>> >> >> chose this architecture. I also wanted an even split so that, if one
>> >> >> machine is down, only 3 ZooKeepers are down and the second machine is
>> >> >> still working. If I have an odd number, say 5 or 7, and the server with
>> >> >> the larger number of ZooKeepers is down, it's gone.
>> >> >>
>> >> >> Regards.
>> >> >>
>> >> >>
>> >> >> On Mon, Sep 15, 2014 at 7:29 PM, Flavio Junqueira <
>> >> >> fpjunqueira@yahoo.com.invalid> wrote:
>> >> >>
>> >> >>> I believe you have shared just the client-side errors, and I was
>> >> >>> wondering what's going on with the servers. One problem I could spot
>> >> >>> with the configuration is with the values of dataDir and dataLogDir.
>> >> >>> It looks like the processes on the same node are writing to the same
>> >> >>> directory, which should be confusing the servers.
>> >> >>>
>> >> >>> A couple of things about your setup. I'm not sure what your motivation
>> >> >>> is for putting multiple servers on the same node. It will induce
>> >> >>> correlated crashes for the servers on the same node. Also, we in
>> >> >>> general recommend using an odd number of servers (5 or 7 in your case).
>> >> >>>
>> >> >>> -Flavio
>> >> >>>
>> >> >>> On Wednesday, September 10, 2014 6:29 AM, lalit jangra <
>> >> >>> lalit.j.jangra@gmail.com> wrote:
>> >> >>>
>> >> >>>
>> >> >>> >
>> >> >>> >
>> >> >>> >Hi,
>> >> >>> >
>> >> >>> >I am running a cluster of two Apache ManifoldCF nodes on two separate
>> >> >>> >machines, each of which has 3 ZooKeeper instances (6 instances in
>> >> >>> >total in the cluster). When I start the ManifoldCF agents, I see the
>> >> >>> >warnings below during startup.
>> >> >>> >
>> >> >>> >[http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
>> >> >>> >org.apache.zookeeper.ClientCnxn - Unable to read additional data from
>> >> >>> >server sessionid 0x0, likely server has closed socket, closing socket
>> >> >>> >connection and attempting reconnect
>> >> >>> >
>> >> >>> >[http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>> >> >>> >org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>> >> >>> >iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
>> >> >>> >authenticate using SASL (unknown error)
>> >> >>> >
>> >> >>> >
>> >> >>> >I can also see the error below in the logs while the agents are running.
>> >> >>> >
>> >> >>> >[localhost-startStop-1-SendThread(iwdc1preecma03.iwater.ie:2183)] WARN
>> >> >>> >org.apache.zookeeper.ClientCnxn - Session 0x6485a8006060079 for server
>> >> >>> >iwdc1preecma03.iwater.ie/10.231.72.24:2183, unexpected error, closing
>> >> >>> >socket connection and attempting reconnect
>> >> >>> >java.io.IOException: Connection reset by peer
>> >> >>> >        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>> >> >>> >        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>> >> >>> >        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
>> >> >>> >        at sun.nio.ch.IOUtil.read(IOUtil.java:193)
>> >> >>> >        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
>> >> >>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
>> >> >>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>> >> >>> >        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>> >> >>> >
>> >> >>> >
>> >> >>> >Below are the configurations for (1) the ZooKeeper nodes and (2) the
>> >> >>> >ManifoldCF nodes' ZooKeeper settings.
>> >> >>> >
>> >> >>> >
>> >> >>> >*zoo.cfg: same for all six ZooKeeper nodes.*
>> >> >>> >
>> >> >>> ># The number of milliseconds of each tick
>> >> >>> >tickTime=2000
>> >> >>> >dataDir=/app/IW/zookeeper/data/data.1
>> >> >>> >dataLogDir=/app/IW/zookeeper/logs/log.1
>> >> >>> >clientPort=2181
>> >> >>> >server.1=iwdc1preecma03:2888:3888
>> >> >>> >server.2=iwdc1preecma03:2889:3889
>> >> >>> >server.3=iwdc1preecma03:2890:3890
>> >> >>> >server.4=iwdc2preecma04:2891:3891
>> >> >>> >server.5=iwdc2preecma04:2892:3892
>> >> >>> >server.6=iwdc2preecma04:2893:3893
>> >> >>> ># The number of ticks that the initial
>> >> >>> ># synchronization phase can take
>> >> >>> >initLimit=10
>> >> >>> ># The number of ticks that can pass between
>> >> >>> ># sending a request and getting an acknowledgement
>> >> >>> >syncLimit=5
>> >> >>> ># the directory where the snapshot is stored.
>> >> >>> ># do not use /tmp for storage, /tmp here is just
>> >> >>> ># example sakes.
>> >> >>> >#dataDir=/tmp/zookeeper
>> >> >>> ># the port at which the clients will connect
>> >> >>> >#clientPort=2181
>> >> >>> ># the maximum number of client connections.
>> >> >>> ># increase this if you need to handle more clients
>> >> >>> >#maxClientCnxns=60
>> >> >>> >#
>> >> >>> ># Be sure to read the maintenance section of the
>> >> >>> ># administrator guide before turning on autopurge.
>> >> >>> >#
>> >> >>> ># http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
>> >> >>> >#
>> >> >>> ># The number of snapshots to retain in dataDir
>> >> >>> >autopurge.snapRetainCount=3
>> >> >>> ># Purge task interval in hours
>> >> >>> ># Set to "0" to disable auto purge feature
>> >> >>> >autopurge.purgeInterval=1
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> >> >>> >*ManifoldCF configurations : same for both ManifoldCF nodes.*
>> >> >>> >
>> >> >>> >
>> >> >>> ><property name="org.apache.manifoldcf.lockmanagerclass"
>> >> >>> >value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
>> >> >>> >
>> >> >>> ><property name="org.apache.manifoldcf.zookeeper.connectstring"
>> >> >>> >value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
>> >> >>> >
>> >> >>> ><property name="org.apache.manifoldcf.zookeeper.sessiontimeout"
>> >> >>> >value="4000"/>
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> >> >>> >*I want to know whether, given the above warnings/errors, ZooKeeper
>> >> >>> >will stop working, or whether it will keep working and these are
>> >> >>> >non-fatal messages, because the ManifoldCF jobs are stuck while I see
>> >> >>> >these errors.*
>> >> >>> >
>> >> >>> >Please suggest.
>> >> >>> >
>> >> >>> >Regards,
>> >> >>> >Lalit.