incubator-s4-user mailing list archives

From J Mohamed Zahoor <jmo...@gmail.com>
Subject Re: S4 dies out after a day
Date Tue, 07 Feb 2012 14:58:50 GMT
Hi

The system was fairly idle (no events for at least 10 hours), so it seems unlikely that GC would have been triggered...
Anyway, I will turn on the GC logs and report back...

./Zahoor@iPad
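
Once GC logging is on, one way to test the GC-pause theory is to scan the log for pauses that approach the 5-second session timeout. The sketch below is only a rough illustration: it assumes a HotSpot JVM started with `-XX:+PrintGCDetails -Xloggc:<file>`, whose log lines end in a `[Times: user=... sys=..., real=... secs]` suffix. The class name `GcPauseScan` and the method `pausesAtLeast` are made up for this example, and how the S4 start script passes JVM flags is not covered in this thread.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of S4 or ZooKeeper.
public class GcPauseScan {
    // Assumes the "[Times: user=... sys=..., real=1.23 secs]" suffix written by
    // HotSpot with -XX:+PrintGCDetails; other JVMs or log formats need another pattern.
    private static final Pattern REAL = Pattern.compile("real=(\\d+(?:\\.\\d+)?) secs");

    /** Returns the log lines whose wall-clock ("real") time is at least thresholdSeconds. */
    public static List<String> pausesAtLeast(List<String> lines, double thresholdSeconds) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            Matcher m = REAL.matcher(line);
            if (m.find() && Double.parseDouble(m.group(1)) >= thresholdSeconds) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java GcPauseScan <gc-log-file> [threshold-seconds]");
            return;
        }
        // Default threshold mirrors zk_session_timeout=5000 (milliseconds) from the thread.
        double threshold = args.length > 1 ? Double.parseDouble(args[1]) : 5.0;
        for (String hit : pausesAtLeast(Files.readAllLines(Paths.get(args[0])), threshold)) {
            System.out.println(hit);
        }
    }
}
```

If this prints nothing for the period around 06:50, a long GC pause is a less likely explanation for the expired session, and network or ZooKeeper server issues would be worth checking instead.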

On 07-Feb-2012, at 12:04 PM, kishore g <g.kishore@gmail.com> wrote:

> Probably because of a GC pause. 5 seconds is definitely aggressive; unless you really need such a quick failover, you can change the timeout to 15 seconds.
> 
> You probably want to turn on GC logs, if you haven't already, and see if there are long pauses.
> 
> On Mon, Feb 6, 2012 at 8:51 PM, J Mohamed Zahoor <jmozah@gmail.com> wrote:
> Hi
> 
> I am using S4 version 0.3.0 with dynamic configuration (external ZooKeeper quorum).
> I am seeing the application restart when S4 has been up for a long time.
> It is caused by the ZooKeeper session timeout.
> 
> 
> My timeout in the s4-core.properties file is "zk_session_timeout=5000".
> Below is the log snippet:
> 
> 
> 2012-02-07 06:50:02,769 io.s4.comm.zk.ZkProcessMonitor INFO (ZkProcessMonitor.java:110) Setting watch on /s4/s4/process
> 2012-02-07 06:50:03,996 org.apache.zookeeper.ClientCnxn INFO (ClientCnxn.java:1118) Unable to read additional data from server sessionid 0x34cd07724608a3, likely server has closed socket, closi
> 2012-02-07 06:50:04,097 io.s4.comm.core.DefaultWatcher INFO (DefaultWatcher.java:87) Received zk event:WatchedEvent state:Disconnected type:None path:null
> 2012-02-07 06:50:04,382 org.apache.zookeeper.ClientCnxn INFO (ClientCnxn.java:1000) Opening socket connection to server master.ch.bd.net/10.184.17.10:2181
> 2012-02-07 06:50:04,383 org.apache.zookeeper.ClientCnxn INFO (ClientCnxn.java:908) Socket connection established to master.ch.bd.net/10.184.17.10:2181, initiating session
> 2012-02-07 06:50:04,385 io.s4.comm.core.DefaultWatcher INFO (DefaultWatcher.java:87) Received zk event:WatchedEvent state:Expired type:None path:null
> 2012-02-07 06:50:04,385 org.apache.zookeeper.ClientCnxn INFO (ClientCnxn.java:1114) Unable to reconnect to ZooKeeper service, session 0x34cd07724608a7 has expired, closing socket connection
> 2012-02-07 06:50:04,486 io.s4.comm.zk.ZkProcessMonitor WARN (ZkProcessMonitor.java:117) KeeperException in ProcessMonitor.run
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /s4/s4/process
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1271)
>         at io.s4.comm.zk.ZkProcessMonitor.run(ZkProcessMonitor.java:111)
>         at java.lang.Thread.run(Thread.java:636)
> 2012-02-07 06:50:06,079 org.apache.zookeeper.ClientCnxn INFO (ClientCnxn.java:1000) Opening socket connection to server master.ch.bd.net/10.184.17.10:2181
> 2012-02-07 06:50:06,080 org.apache.zookeeper.ClientCnxn INFO (ClientCnxn.java:908) Socket connection established to master.ch.bd.net/10.184.17.10:2181, initiating session
> 2012-02-07 06:50:06,081 org.apache.zookeeper.ClientCnxn INFO (ClientCnxn.java:1114) Unable to reconnect to ZooKeeper service, session 0x34cd07724608a3 has expired, closing socket connection
> 2012-02-07 06:50:06,081 io.s4.comm.core.DefaultWatcher INFO (DefaultWatcher.java:87) Received zk event:WatchedEvent state:Expired type:None path:null
> 2012-02-07 06:50:06,081 io.s4.listener.CommLayerListener ERROR (CommLayerListener.java:121) Communication layer broken: source:WatchedEvent state:Expired type:None path:null
> 2012-02-07 06:50:06,081 io.s4.listener.CommLayerListener ERROR (CommLayerListener.java:123) System exiting so that process can restart.
> 
> 
> Any help with prolonging this session would be appreciated.
> 
> ./zahoor
> 
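
For reference, the 15-second timeout suggested in the reply would look roughly like this in s4-core.properties. This assumes the value is in milliseconds, as the existing 5000 setting for 5 seconds implies. Note also that a ZooKeeper server only grants session timeouts within its own configured bounds (by default 2 to 20 times its tickTime), so the effective timeout can differ from the requested one.

```
# s4-core.properties (sketch: only this line changes)
# 15 seconds, assuming the unit is milliseconds as with the original 5000
zk_session_timeout=15000
```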
