zookeeper-user mailing list archives

From Flavio Junqueira <fpjunque...@yahoo.com.INVALID>
Subject Re: Getting errors in zookeeper logs
Date Mon, 15 Sep 2014 22:56:17 GMT
Instead of guessing, I think it is best to understand what is going wrong with the servers,
and for that you need to look at the server logs. If you don't know how to get them, could you
please share the command you're using to start the servers?
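
While locating the logs, one quick way to see whether each server is answering at all is to send it one of ZooKeeper's four-letter commands (`ruok`, `srvr`) on its client port. Below is a minimal Python sketch, not a definitive tool. It assumes the four-letter commands are enabled on your version, and the helper name `four_letter_word` is made up for illustration.

```python
import socket


def four_letter_word(host, port, cmd="srvr", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply as text.

    Hypothetical helper for illustration; it only opens a TCP connection to
    the client port, writes the command, and reads until the server closes.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def report(servers, cmd="ruok"):
    """Print one line per (host, port) pair: the reply, or why it failed."""
    for host, port in servers:
        try:
            print(host, port, four_letter_word(host, port, cmd).strip())
        except OSError as exc:  # refused, reset, timed out, unresolvable host
            print(host, port, "unreachable:", exc)
```

For example, `report([("iwdc1preecma03", 2181), ("iwdc1preecma03", 2182)])` would probe two of the endpoints from the connect string in the original message. A healthy server answers `ruok` with `imok`, and `srvr` reports whether it is a leader or follower. If the servers were started with the stock `zkServer.sh`, its console output usually ends up in a `zookeeper.out` file in the directory the script was started from, unless the log directory was overridden.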

-Flavio



On Monday, September 15, 2014 3:30 PM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
 

>
>
>Hello Flavio,
>
>Can this issue arise from the system not having enough RAM for the Java heap?
>I can see my system is running at the top of its RAM.
>
>Also, is there any way to assign memory to ZooKeeper nodes?
>
>Regards.
>
>On Mon, Sep 15, 2014 at 7:37 PM, lalit jangra <lalit.j.jangra@gmail.com>
>wrote:
>
>> Thanks Flavio,
>>
>> I have 3+3 ZooKeeper nodes on two servers, MCF1 & MCF2, and I see the
>> same error on both. As for the server logs, I am not able to read
>> anything from them. How can I read and interpret from the ZooKeeper
>> servers what is wrong?
>>
>> I have put different log & data directories for each ZooKeeper instance;
>> maybe I should elaborate a bit more. I name the log & data directories
>> according to myid (ranging from 1 to 6).
>>
>> ZK1 -> Data.1 -> Logs.1
>> ZK2 -> Data.2 -> Logs.2
>> ZK3 -> Data.3 -> Logs.3
>> ZK4 -> Data.4 -> Logs.4
>> ZK5 -> Data.5 -> Logs.5
>> ZK6 -> Data.6 -> Logs.6
>>
>> As I have only two servers and need to run on these two only, I chose
>> this architecture. I am also trying to keep the split even so that, if
>> one server is down, only 3 ZooKeepers are down and the second server is
>> still working. If I have an odd number, say 5 or 7, and the server with
>> more ZooKeepers goes down, it's gone.
>>
>> Regards.
>>
>>
>> On Mon, Sep 15, 2014 at 7:29 PM, Flavio Junqueira <
>> fpjunqueira@yahoo.com.invalid> wrote:
>>
>>> I believe you have shared just the client-side errors, and I was
>>> wondering what's going on with the servers. One problem I could spot with
>>> the configuration is with the values of dataDir and dataLogDir. It looks
>>> like the processes on the same node are writing to the same directory,
>>> which should be confusing the servers.
>>>
>>> A couple of things about your setting. I'm not sure what your motivation
>>> is to put multiple servers on the same node. It will induce correlated
>>> crashes for the servers on the same node. Also, we generally recommend
>>> using an odd number of servers (5 or 7 for your case).
>>>
>>> -Flavio
>>>
>>> On Wednesday, September 10, 2014 6:29 AM, lalit jangra <
>>> lalit.j.jangra@gmail.com> wrote:
>>>
>>>
>>> >
>>> >
>>> >Hi,
>>> >
>>> >I am running a cluster of two Apache ManifoldCF nodes on two separate
>>> >machines, each of which has 3 ZooKeeper instances (6 instances in total
>>> >in the cluster). When I start the ManifoldCF agents, I see the messages
>>> >below during startup.
>>> >
>>> >[http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
>>> >org.apache.zookeeper.ClientCnxn - Unable to read additional data from
>>> >server sessionid 0x0, likely server has closed socket, closing socket
>>> >connection and attempting reconnect
>>> >
>>> >[http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>> >org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>> >iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
>>> >authenticate using SASL (unknown error)
>>> >
>>> >
>>> >I also see the error below in the logs while the agents are running.
>>> >
>>> >[localhost-startStop-1-SendThread(iwdc1preecma03.iwater.ie:2183)] WARN
>>> >org.apache.zookeeper.ClientCnxn - Session 0x6485a8006060079 for server
>>> >iwdc1preecma03.iwater.ie/10.231.72.24:2183, unexpected error, closing
>>> >socket connection and attempting reconnect
>>> >
>>> >java.io.IOException: Connection reset by peer
>>> >        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>> >        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>> >        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
>>> >        at sun.nio.ch.IOUtil.read(IOUtil.java:193)
>>> >        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
>>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
>>> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>>> >        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>> >
>>> >
>>> >Below are configurations for 1. zookeeper nodes & 2. MCF nodes for
>>> >zookeeper.
>>> >
>>> >
>>> >*zoo.cfg :  Same for all six zookeeper nodes.*
>>> >
>>> >
>>> ># The number of milliseconds of each tick
>>> >
>>> >tickTime=2000
>>> >
>>> >dataDir=/app/IW/zookeeper/data/data.1
>>> >
>>> >dataLogDir=/app/IW/zookeeper/logs/log.1
>>> >
>>> >clientPort=2181
>>> >
>>> >server.1=iwdc1preecma03:2888:3888
>>> >
>>> >server.2=iwdc1preecma03:2889:3889
>>> >
>>> >server.3=iwdc1preecma03:2890:3890
>>> >
>>> >server.4=iwdc2preecma04:2891:3891
>>> >
>>> >server.5=iwdc2preecma04:2892:3892
>>> >
>>> >server.6=iwdc2preecma04:2893:3893
>>> >
>>> ># The number of ticks that the initial
>>> >
>>> ># synchronization phase can take
>>> >
>>> >initLimit=10
>>> >
>>> ># The number of ticks that can pass between
>>> >
>>> ># sending a request and getting an acknowledgement
>>> >
>>> >syncLimit=5
>>> >
>>> ># the directory where the snapshot is stored.
>>> >
>>> ># do not use /tmp for storage, /tmp here is just
>>> >
>>> ># example sakes.
>>> >
>>> >#dataDir=/tmp/zookeeper
>>> >
>>> ># the port at which the clients will connect
>>> >
>>> >#clientPort=2181
>>> >
>>> ># the maximum number of client connections.
>>> >
>>> ># increase this if you need to handle more clients
>>> >
>>> >#maxClientCnxns=60
>>> >
>>> >#
>>> >
>>> ># Be sure to read the maintenance section of the
>>> >
>>> ># administrator guide before turning on autopurge.
>>> >
>>> >#
>>> >
>>> ># http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
>>> >
>>> >#
>>> >
>>> ># The number of snapshots to retain in dataDir
>>> >
>>> >autopurge.snapRetainCount=3
>>> >
>>> ># Purge task interval in hours
>>> >
>>> ># Set to "0" to disable auto purge feature
>>> >
>>> >autopurge.purgeInterval=1
>>> >
>>> >
>>> >
>>> >*ManifoldCF configurations : same for both ManifoldCF nodes.*
>>> >
>>> >
>>> ><property name="org.apache.manifoldcf.lockmanagerclass"
>>> >value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
>>> >
>>> ><property name="org.apache.manifoldcf.zookeeper.connectstring"
>>> >value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
>>> >
>>> ><property name="org.apache.manifoldcf.zookeeper.sessiontimeout"
>>> >value="4000"/>
>>> >
>>> >
>>> >
>>> >*I want to know whether, given the above warnings/errors, ZooKeeper will
>>> >stop working, or whether it will keep working and these are non-fatal
>>> >messages, because ManifoldCF jobs are stuck while I see these errors.*
>>> >
>>> >Please suggest.
>>> >
>>> >Regards,
>>> >Lalit.
>
>>> >
>>> >
>>> >
>>
>>
>>
>>
>> --
>> Regards,
>> Lalit.
>>
>
>
>
>-- 
>Regards,
>Lalit.
>
>
>