incubator-s4-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: S4 dies out after a day
Date Tue, 07 Feb 2012 06:34:38 GMT
This is probably caused by a GC pause. A 5-second timeout is definitely
aggressive. Unless you really need such a quick failover, you can change
the timeout to 15 seconds.
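
As a rough sketch, the change in s4-core.properties would look like the line below. This assumes the value is in milliseconds, as the existing setting of 5000 suggests, and that the ZooKeeper servers accept a session timeout of that length:

```properties
# Raised from 5000 so that a multi-second GC pause does not expire the session
zk_session_timeout=15000
```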

You probably want to turn on GC logging, if you haven't already, and check
whether there are long pauses.
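
One way to check for long pauses is to scan the GC log for entries longer than the session timeout. The sketch below assumes a HotSpot JVM started with options such as `-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:<file>`, whose entries end in `<seconds> secs]`. The function name `long_pauses` and the sample lines are hypothetical illustrations, not output from the system discussed here.

```python
import re

# Matches a duration in a HotSpot GC log entry, e.g. "..., 6.2031450 secs]".
PAUSE_RE = re.compile(r"(\d+\.\d+) secs\]")


def long_pauses(lines, threshold_secs):
    """Return (line_number, pause_secs) for entries longer than threshold_secs."""
    hits = []
    for number, line in enumerate(lines, start=1):
        durations = [float(d) for d in PAUSE_RE.findall(line)]
        # An entry may report several durations; the largest is the one that matters.
        if durations and max(durations) > threshold_secs:
            hits.append((number, max(durations)))
    return hits


# Hypothetical sample entries, for illustration only (not taken from a real run).
SAMPLE = [
    "12.345: [GC 65536K->8192K(251392K), 0.0123450 secs]",
    "86400.100: [Full GC 240000K->120000K(251392K), 6.2031450 secs]",
]

if __name__ == "__main__":
    # 5.0 corresponds to zk_session_timeout=5000 from the original message.
    for number, secs in long_pauses(SAMPLE, threshold_secs=5.0):
        print(f"line {number}: pause of {secs:.3f}s exceeds the session timeout")
```

To run it against a real log, pass the open file object as `lines`. Any pause close to or above the configured session timeout is a likely cause of the expiry.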

On Mon, Feb 6, 2012 at 8:51 PM, J Mohamed Zahoor <jmozah@gmail.com> wrote:

> Hi
>
> I am using S4 version 0.3.0 with dynamic configuration (an external
> ZooKeeper quorum).
> The application restarts after S4 has been up for a long time.
> The cause is the ZooKeeper session timing out.
>
>
> My timeout in s4-core.properties is "zk_session_timeout=5000".
> Below is the log snippet:
>
>
> 2012-02-07 06:50:02,769 io.s4.comm.zk.ZkProcessMonitor INFO
> (ZkProcessMonitor.java:110) Setting watch on /s4/s4/process
> 2012-02-07 06:50:03,996 org.apache.zookeeper.ClientCnxn INFO
> (ClientCnxn.java:1118) Unable to read additional data from server sessionid
> 0x34cd07724608a3, likely server has closed socket, closi
> 2012-02-07 06:50:04,097 io.s4.comm.core.DefaultWatcher INFO
> (DefaultWatcher.java:87) Received zk event:WatchedEvent state:Disconnected
> type:None path:null
> 2012-02-07 06:50:04,382 org.apache.zookeeper.ClientCnxn INFO
> (ClientCnxn.java:1000) Opening socket connection to server
> master.ch.bd.net/10.184.17.10:2181
> 2012-02-07 06:50:04,383 org.apache.zookeeper.ClientCnxn INFO
> (ClientCnxn.java:908) Socket connection established to
> master.ch.bd.net/10.184.17.10:2181, initiating session
> 2012-02-07 06:50:04,385 io.s4.comm.core.DefaultWatcher INFO
> (DefaultWatcher.java:87) Received zk event:WatchedEvent state:Expired
> type:None path:null
> 2012-02-07 06:50:04,385 org.apache.zookeeper.ClientCnxn INFO
> (ClientCnxn.java:1114) Unable to reconnect to ZooKeeper service, session
> 0x34cd07724608a7 has expired, closing socket connection
> 2012-02-07 06:50:04,486 io.s4.comm.zk.ZkProcessMonitor WARN
> (ZkProcessMonitor.java:117) KeeperException in ProcessMonitor.run
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /s4/s4/process
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1271)
>         at io.s4.comm.zk.ZkProcessMonitor.run(ZkProcessMonitor.java:111)
>         at java.lang.Thread.run(Thread.java:636)
> 2012-02-07 06:50:06,079 org.apache.zookeeper.ClientCnxn INFO
> (ClientCnxn.java:1000) Opening socket connection to server
> master.ch.bd.net/10.184.17.10:2181
> 2012-02-07 06:50:06,080 org.apache.zookeeper.ClientCnxn INFO
> (ClientCnxn.java:908) Socket connection established to
> master.ch.bd.net/10.184.17.10:2181, initiating session
> 2012-02-07 06:50:06,081 org.apache.zookeeper.ClientCnxn INFO
> (ClientCnxn.java:1114) Unable to reconnect to ZooKeeper service, session
> 0x34cd07724608a3 has expired, closing socket connection
> 2012-02-07 06:50:06,081 io.s4.comm.core.DefaultWatcher INFO
> (DefaultWatcher.java:87) Received zk event:WatchedEvent state:Expired
> type:None path:null
> 2012-02-07 06:50:06,081 io.s4.listener.CommLayerListener ERROR
> (CommLayerListener.java:121) Communication layer broken:
> source:WatchedEvent state:Expired type:None path:null
> 2012-02-07 06:50:06,081 io.s4.listener.CommLayerListener ERROR
> (CommLayerListener.java:123) System exiting so that process can restart.
>
>
> Any help in prolonging this session would be appreciated.
>
> ./zahoor
>
