manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Getting errors in zookeeper logs
Date Mon, 15 Sep 2014 13:21:13 GMT
Hi Lalit,

When MCF cannot reach zookeeper, MCF crawls will pause until the zookeeper
connections are reestablished.  Then the crawls should resume.  This should
*not* abort your crawls, but it will make them very slow.

I am not a zookeeper expert, so I would post on their message boards to see
if there is any adjustment that can be made to zookeeper parameters that
would improve zookeeper behavior when you have a flaky network.  However,
since the obvious solution is to fix your network, they may not have a code
solution for you.
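
One parameter that may be relevant is the session timeout. As the zookeeper
administrator documentation describes it, the server clamps the timeout a
client requests to the range [minSessionTimeout, maxSessionTimeout], which
default to 2 and 20 times tickTime respectively. The Python sketch below only
illustrates that arithmetic using the values from this thread (tickTime=2000,
sessiontimeout=4000). The function name and its arguments are hypothetical,
made up for illustration; they are not part of zookeeper or MCF.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms,
                               min_ms=None, max_ms=None):
    """Illustrative helper (not a zookeeper API): clamp a requested
    session timeout to the server-side bounds."""
    # Documented defaults when the bounds are not set in zoo.cfg:
    # minSessionTimeout = 2 * tickTime, maxSessionTimeout = 20 * tickTime.
    lower = 2 * tick_time_ms if min_ms is None else min_ms
    upper = 20 * tick_time_ms if max_ms is None else max_ms
    return max(lower, min(requested_ms, upper))


if __name__ == "__main__":
    # Values from this thread: tickTime=2000, sessiontimeout=4000.
    print(negotiated_session_timeout(4000, 2000))
    # A larger request is honoured up to 20 * tickTime.
    print(negotiated_session_timeout(30000, 2000))
    print(negotiated_session_timeout(60000, 2000))
```

With tickTime=2000, the requested 4000 ms is already the smallest value the
server will grant, which is consistent with "negotiated timeout = 4000" in the
log. A larger sessiontimeout (up to 40000 ms with these settings) would give
clients longer to reconnect before their session expires. Whether that
actually helps on a flaky network is something to confirm with the zookeeper
community.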

Thanks,
Karl


On Mon, Sep 15, 2014 at 9:15 AM, lalit jangra <lalit.j.jangra@gmail.com>
wrote:

> Thanks Karl,
>
> Ideally, resetting connections should be taken care of by zookeeper itself,
> as I could see connections being re-established later in the logs.
>
> Can you suggest any way to overcome this, in addition to resolving the
> network issue, as my crawls repeatedly stop working? Anything in the config
> files, etc.?
>
> Regards.
>
>
> On Mon, Sep 15, 2014 at 6:39 PM, Karl Wright <daddywri@gmail.com> wrote:
>
>> Hi Lalit,
>>
>> Zookeeper will keep working, but you should understand that you are
>> dropping connections to your zookeeper members for unknown reasons, which
>> is causing your crawl to stall when it happens.  This argues that perhaps
>> you have some network flakiness of some kind.
>>
>> Karl
>>
>>
>> On Mon, Sep 15, 2014 at 8:59 AM, lalit jangra <lalit.j.jangra@gmail.com>
>> wrote:
>>
>>>
>>> Hi,
>>>
>>> I am running a cluster of two Apache ManifoldCF nodes on two separate
>>> machines, each of which has 3 zookeeper instances (6 instances in total
>>> in the cluster). When I start up the ManifoldCF agents, I see the warnings
>>> below during startup.
>>>
>>> [http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
>>> org.apache.zookeeper.ClientCnxn - Unable to read additional data from
>>> server sessionid 0x0, likely server has closed socket, closing socket
>>> connection and attempting reconnect
>>>
>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
>>> authenticate using SASL (unknown error)
>>>
>>>
>>> I can also see the errors below in the logs while the agents are running.
>>>
>>> [http-bio-80-exec-2] INFO org.apache.zookeeper.ZooKeeper - Initiating
>>> client connection,
>>> connectString=iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183
>>> sessionTimeout=4000
>>> watcher=org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection$ZooKeeperWatcher@51d83fd7
>>>
>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
>>> authenticate using SASL (unknown error)
>>>
>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182, initiating session
>>>
>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] WARN
>>> org.apache.zookeeper.ClientCnxn - Session 0x0 for server
>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182, unexpected error, closing
>>> socket connection and attempting reconnect
>>>
>>> java.io.IOException: Connection reset by peer
>>>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
>>>         at sun.nio.ch.IOUtil.read(IOUtil.java:193)
>>>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
>>>         at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
>>>         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>
>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
>>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>> iwdc2preecma04.iwater.ie/10.231.72.25:2183. Will not attempt to
>>> authenticate using SASL (unknown error)
>>>
>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
>>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>>> iwdc2preecma04.iwater.ie/10.231.72.25:2183, initiating session
>>>
>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
>>> org.apache.zookeeper.ClientCnxn - Session establishment complete on server
>>> iwdc2preecma04.iwater.ie/10.231.72.25:2183, sessionid =
>>> 0x6487851bd330078, negotiated timeout = 4000
>>>
>>>
>>> Below are the configurations for (1) the zookeeper nodes and (2) the
>>> zookeeper settings on the MCF nodes.
>>>
>>>
>>> *zoo.cfg :  Same for all six zookeeper nodes.*
>>>
>>>
>>> # The number of milliseconds of each tick
>>> tickTime=2000
>>> dataDir=/app/IW/zookeeper/data/data.1
>>> dataLogDir=/app/IW/zookeeper/logs/log.1
>>> clientPort=2181
>>> server.1=iwdc1preecma03:2888:3888
>>> server.2=iwdc1preecma03:2889:3889
>>> server.3=iwdc1preecma03:2890:3890
>>> server.4=iwdc2preecma04:2891:3891
>>> server.5=iwdc2preecma04:2892:3892
>>> server.6=iwdc2preecma04:2893:3893
>>> # The number of ticks that the initial
>>> # synchronization phase can take
>>> initLimit=10
>>> # The number of ticks that can pass between
>>> # sending a request and getting an acknowledgement
>>> syncLimit=5
>>> # the directory where the snapshot is stored.
>>> # do not use /tmp for storage, /tmp here is just
>>> # example sakes.
>>> #dataDir=/tmp/zookeeper
>>> # the port at which the clients will connect
>>> #clientPort=2181
>>> # the maximum number of client connections.
>>> # increase this if you need to handle more clients
>>> #maxClientCnxns=60
>>> #
>>> # Be sure to read the maintenance section of the
>>> # administrator guide before turning on autopurge.
>>> #
>>> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
>>> #
>>> # The number of snapshots to retain in dataDir
>>> autopurge.snapRetainCount=3
>>> # Purge task interval in hours
>>> # Set to "0" to disable auto purge feature
>>> autopurge.purgeInterval=1
>>>
>>>
>>>
>>> *ManifoldCF configurations : same for both ManifoldCF nodes.*
>>>
>>>
>>> <property name="org.apache.manifoldcf.lockmanagerclass"
>>> value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
>>>
>>>   <property name="org.apache.manifoldcf.zookeeper.connectstring"
>>> value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
>>>
>>> <property name="org.apache.manifoldcf.zookeeper.sessiontimeout"
>>> value="4000"/>
>>>
>>>
>>>
>>> *I want to know whether, given the above warnings/errors, zookeeper will
>>> stop working, or whether it will keep working and these are non-fatal
>>> messages, because the ManifoldCF jobs are stuck while I can see these errors.*
>>>
>>> Please suggest.
>>>
>>> Regards,
>>> Lalit.
>>>
>>>
>>
>
>
> --
> Regards,
> Lalit.
>
