Return-Path: Delivered-To: apmail-zookeeper-user-archive@www.apache.org Received: (qmail 51587 invoked from network); 15 Feb 2011 19:11:37 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 15 Feb 2011 19:11:37 -0000 Received: (qmail 86912 invoked by uid 500); 15 Feb 2011 19:11:36 -0000 Delivered-To: apmail-zookeeper-user-archive@zookeeper.apache.org Received: (qmail 86701 invoked by uid 500); 15 Feb 2011 19:11:34 -0000 Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@zookeeper.apache.org Delivered-To: mailing list user@zookeeper.apache.org Received: (qmail 86673 invoked by uid 99); 15 Feb 2011 19:11:33 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 15 Feb 2011 19:11:33 +0000 X-ASF-Spam-Status: No, hits=-5.0 required=5.0 tests=RCVD_IN_DNSWL_HI,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of Nicholas.Patania@morganstanley.com designates 205.228.53.68 as permitted sender) Received: from [205.228.53.68] (HELO hqmtaint01.ms.com) (205.228.53.68) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 15 Feb 2011 19:11:24 +0000 Received: from hqmtaint01.ms.com (localhost.ms.com [127.0.0.1]) by hqmtaint01.ms.com (output Postfix) with ESMTP id 0DDD9504318 for ; Tue, 15 Feb 2011 14:11:03 -0500 (EST) X-Anti-Virus: Kaspersky Anti-Virus for Linux Mail Server 5.6.44/RELEASE, bases: 20110215 #4818563, check: 20110215 clean Received: from ny0030as01 (unknown [144.203.194.92]) by hqmtaint01.ms.com (internal Postfix) with ESMTP id 0B53050427D for ; Tue, 15 Feb 2011 14:11:03 -0500 (EST) Received: from ny0030as01 (localhost [127.0.0.1]) by ny0030as01 (msa-out Postfix) with ESMTP id E6220AE40B2 for ; Tue, 15 Feb 2011 14:11:02 -0500 (EST) Received: from hqmsaext01.ms.com (hqmsaext01 [205.228.53.70]) by ny0030as01 (mta-in Postfix) with ESMTP id E260F95C001 for ; Tue, 15 Feb 2011 14:11:02 -0500 (EST) Received: from hqmsaext01.ms.com (localhost.ms.com [127.0.0.1]) by hqmsaext01.ms.com (output Postfix) with ESMTP id D2D75146C064 for ; Tue, 15 Feb 2011 14:11:02 -0500 (EST) X-Anti-Virus: Kaspersky Anti-Virus for Linux Mail Server 5.6.44/RELEASE, bases: 20110215 #4818563, check: 20110215 clean Received: from mail-bw0-f41.google.com (mail-bw0-f41.google.com [209.85.214.41]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority" (verified OK)) by hqmsaext01.ms.com (submission Postfix) with ESMTPS id 738B5146C05F for ; Tue, 15 Feb 2011 14:11:02 -0500 (EST) Received: by bwz16 with SMTP id 16so733341bwz.0 for ; Tue, 15 Feb 2011 11:11:01 -0800 (PST) Received: by 10.204.122.198 with SMTP id m6mr26426739bkr.186.1297797061101; Tue, 15 Feb 2011 11:11:01 -0800 (PST) MIME-Version: 1.0 Received: by 10.204.102.132 with HTTP; Tue, 15 Feb 2011 11:10:40 -0800 (PST) In-Reply-To: <69D3016305F9084FBD2C4A0DF189BD5C16B97AAF14@GSCMAMP02EX.firmwide.corp.gs.com> References: <69D3016305F9084FBD2C4A0DF189BD5C16B7F384FD@GSCMAMP02EX.firmwide.corp.gs.com> <69D3016305F9084FBD2C4A0DF189BD5C16B7F384FE@GSCMAMP02EX.firmwide.corp.gs.com> <69D3016305F9084FBD2C4A0DF189BD5C16B7F38506@GSCMAMP02EX.firmwide.corp.gs.com> <69D3016305F9084FBD2C4A0DF189BD5C16B97AAF14@GSCMAMP02EX.firmwide.corp.gs.com> From: Nick Patania Date: Tue, 15 Feb 2011 14:10:40 -0500 Message-ID: Subject: Re: tickTime and sessionTimeout To: user@zookeeper.apache.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Virus-Checked: Checked by ClamAV on apache.org The two "suspicious" traces are full. The full epoll trace is: "NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799" daemon prio=3D10 tid=3D0xaec2b000 nid=3D0x6945 runnable [0xaed65000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69) - locked <0xdfa04348> (a sun.nio.ch.Util$1) - locked <0xdfa04358> (a java.util.Collections$UnmodifiableSet) - locked <0xdfa04308> (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerC= nxn.java:232) On Tue, Feb 15, 2011 at 2:06 PM, Fournier, Camille F. [Tech] wrote: > Can you send the full stack trace? > > -----Original Message----- > From: Nick Patania [mailto:Nicholas.Patania@morganstanley.com] > Sent: Tuesday, February 15, 2011 12:16 PM > To: user@zookeeper.apache.org > Subject: Re: tickTime and sessionTimeout > > I did some quick and dirty profiling, and during the period leading to > the expiration of SESSION_1, two of the server's threads are > suspiciously occupied as follows: > > "SyncThread:0" prio=3D10 tid=3D0xaf1cec00 nid=3D0x6947 runnable [0xaebad0= 00] > =A0 java.lang.Thread.State: RUNNABLE > =A0 =A0 =A0 =A0at sun.nio.ch.FileDispatcher.preClose0(Native Method) > =A0 =A0 =A0 =A0at sun.nio.ch.SocketDispatcher.preClose(SocketDispatcher.j= ava:41) > =A0 =A0 =A0 =A0at sun.nio.ch.SocketChannelImpl.implCloseSelectableChannel= (SocketChannelImpl.java:684) > =A0 =A0 =A0 =A0- locked <0xdfa07648> (a java.lang.Object) > =A0 =A0 =A0 =A0at java.nio.channels.spi.AbstractSelectableChannel.implClo= seChannel(AbstractSelectableChannel.java:201) > =A0 =A0 =A0 =A0at java.nio.channels.spi.AbstractInterruptibleChannel.clos= e(AbstractInterruptibleChannel.java:97) > =A0 =A0 =A0 =A0- locked <0xdfa075e8> (a java.lang.Object) > =A0 =A0 =A0 =A0at sun.nio.ch.SocketAdaptor.close(SocketAdaptor.java:352) > =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIO= ServerCnxn.java:1463) > =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServ= erCnxn.java:1412) > =A0 =A0 =A0 =A0- locked <0xdfa1c0d0> (a java.util.HashSet) > =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn$Factory.close= SessionWithoutWakeup(NIOServerCnxn.java:343) > =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn$Factory.close= Session(NIOServerCnxn.java:330) > =A0 =A0 =A0 =A0- locked <0xdfa04240> (a > org.apache.zookeeper.server.NIOServerCnxn$Factory) > =A0 =A0 =A0 =A0at org.apache.zookeeper.server.FinalRequestProcessor.proce= ssRequest(FinalRequestProcessor.java:133) > =A0 =A0 =A0 =A0at org.apache.zookeeper.server.SyncRequestProcessor.flush(= SyncRequestProcessor.java:161) > =A0 =A0 =A0 =A0at org.apache.zookeeper.server.SyncRequestProcessor.run(Sy= ncRequestProcessor.java:98) > > "NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799" daemon prio=3D10 > tid=3D0xaec2b000 nid=3D0x6945 waiting for monitor entry [ > 0xaed65000] > =A0 java.lang.Thread.State: BLOCKED (on object monitor) > =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(N= IOServerCnxn.java:235) > =A0 =A0 =A0 =A0- waiting to lock <0xdfa04240> (a > org.apache.zookeeper.server.NIOServerCnxn$Factory) > > These two threads are in this state for around 1.66 seconds. =A0Does > this mean something to anyone? > > Note how the second thread "normally" seems to be in epoll_wait: > > "NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799" daemon prio=3D10 > tid=3D0xaec2b000 nid=3D0x6945 runnable [0xaed65000] > =A0 java.lang.Thread.State: RUNNABLE > =A0 =A0 =A0 =A0at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) > >> Is the server a virtualized host? I still think this might shed some >> light, what are you seeing for this? before/during/after the event: > > It's not a virtualized host. > > stat prints: > > Latency min/avg/max: 0/0/62 > > But it's 62 (exactly) before, during and after (I run it every 50ms), > which makes me suspicious of the output. > >> It is blocked a >> bit behind the expiring of SESSION_0 due to the synchronous nature of to= uching >> the session table > > That can't possibly take hundreds of milliseconds, can it? > > On Mon, Feb 14, 2011 at 1:32 AM, Patrick Hunt wrote: >> Is the server a virtualized host? I still think this might shed some >> light, what are you seeing for this? before/during/after the event: >> >>>>>> Use the "stat" 4 >>>>>> letter word to see the server's request processing latency, see if >>>>>> that's high (higer than the timeout is bad news). Again, checkout th= e >>>>>> troubleshooting guide. >> >> On Fri, Feb 11, 2011 at 4:32 PM, Fournier, Camille F. [Tech] >> wrote: >>> I think the gist of the problem is that while the ZK Server is expiring= SESSION_0, SESSION_1 is coming in and trying to send a ping. It is blocked= a bit behind the expiring of SESSION_0 due to the synchronous nature of to= uching the session table, and then its request for a ping will be processed= behind the session expiration processing for SESSION_0. So either the expi= rer takes long enough dealing with SESSION_0 that it immediately wants to e= xpire SESSION_1 on next processing, or, the processing of the session expir= ation takes just long enough for SESSION_1 to not get a response to its hea= rtbeat back from the server, which then causes it to disconnect and reconne= ct, and in the interim the server determines the session dead due to timeou= t. >>> >>> Long story short, those timeouts are too short for the server to reliab= ly execute responses in time to guarantee they won't be incorrectly activat= ed. >>> >>> C >>> >>> -----Original Message----- >>> From: Nick Patania [mailto:Nicholas.Patania@morganstanley.com] >>> Sent: Friday, February 11, 2011 6:19 PM >>> To: user@zookeeper.apache.org >>> Subject: Re: tickTime and sessionTimeout >>> >>> 3.3.2-1031432 >>> >>> On Fri, Feb 11, 2011 at 6:15 PM, Fournier, Camille F. [Tech] >>> wrote: >>>> Which version of ZK? >>>> >>>> -----Original Message----- >>>> From: Nick Patania [mailto:Nicholas.Patania@morganstanley.com] >>>> Sent: Friday, February 11, 2011 6:10 PM >>>> To: user@zookeeper.apache.org >>>> Subject: Re: tickTime and sessionTimeout >>>> >>>> A single server. =A0I've intentionally made it trivial to demonstrate >>>> the behavior. >>>> If I increase the timeout, the issue goes away. >>>> >>>> On Fri, Feb 11, 2011 at 6:00 PM, Fournier, Camille F. [Tech] >>>> wrote: >>>>> What is your ZooKeeper setup here? And do you continue to see this is= sue if you increase your session timeout? >>>>> >>>>> C >>>>> >>>>> -----Original Message----- >>>>> From: Nick Patania [mailto:Nicholas.Patania@morganstanley.com] >>>>> Sent: Friday, February 11, 2011 5:24 PM >>>>> To: user@zookeeper.apache.org >>>>> Subject: Re: tickTime and sessionTimeout >>>>> >>>>> Patrick, thanks for your input. >>>>> >>>>> I have rerun the test several times now while logging GC on the serve= r >>>>> and running ping from CLIENT_1: >>>>> >>>>> =A0- No GC happens on the server during the period of interest (a >>>>> couple of young generation runs happen before I kill HOST_0, and they >>>>> complete in under 3ms). >>>>> =A0- Round trip times for ping from CLIENT_1 are consistently under >>>>> 250us throughout. >>>>> >>>>> Regarding client GC -- I can consistently reproduce this using a C >>>>> client. =A0Regarding the theory of swapping on CLIENT_1 -- if that we= re >>>>> the cause, the problem wouldn't be 100% reproducible. =A0I also looke= d >>>>> through the client log for SESSION_1 -- I see "Got ping response ... >>>>> after 1ms" repeatedly, followed by "Client session timed out, have no= t >>>>> heard from server in 666ms"... >>>>> >>>>> On Fri, Feb 11, 2011 at 1:46 PM, Patrick Hunt wrot= e: >>>>>> Those are pretty short timeouts, many sources of delay could be >>>>>> causing this. Network jitter/latency, GC/swap (server or client), IO >>>>>> write latency, etc... See if any of this might be your issue: >>>>>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/Troubleshootin= g >>>>>> >>>>>> I can clearly see this sequence in your log for session1: >>>>>> >>>>>> --- >>>>>> 2011-02-11 14:18:52,672 - =A0sessionid:SESSION_1 type:ping >>>>>> >>>>>> 2011-02-11 14:18:54,502 - Expiring session SESSION_1, timeout of 100= 0ms exceeded >>>>>> >>>>>> 2011-02-11 14:18:55,011 - Processing request:: sessionid:SESSION_1 t= ype:ping >>>>>> --- >>>>>> >>>>>> from the looks of it session 1 doesn't send a ping to the server for >>>>>> ~2.5 seconds, as a result it's expired. >>>>>> >>>>>> You should also look at your session 1 client log and see what it's >>>>>> view of the world is like. (is it gc/swapping?). Use the "stat" 4 >>>>>> letter word to see the server's request processing latency, see if >>>>>> that's high (higer than the timeout is bad news). Again, checkout th= e >>>>>> troubleshooting guide. >>>>>> >>>>>> Patrick >>>>>> >>>>>> ps please use pastebin or attachment, otw the formatting of wrecked >>>>>> and it's harder to read the log >>>>>> >>>>>> On Fri, Feb 11, 2011 at 6:41 AM, Nick Patania >>>>>> wrote: >>>>>>> This is the portion that seems relevant. =A0For readability, I repl= aced >>>>>>> the host and session for the host that I kill with HOST_0 and >>>>>>> SESSION_0 (I expect these to timeout). =A0The client that should be >>>>>>> healthy is HOST_1 and SESSION_1. >>>>>>> >>>>>>> >>>>>>> 2011-02-11 14:18:51,901 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request:: >>>>>>> sessionid:SESSION_0 type:ping cxid:0xfffffffffffffffe >>>>>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:51,901 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_0 >>>>>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe >>>>>>> txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:52,005 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request:: >>>>>>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe >>>>>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:52,005 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1 >>>>>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe >>>>>>> txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:52,339 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request:: >>>>>>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe >>>>>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:52,339 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1 >>>>>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe >>>>>>> txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:52,672 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request:: >>>>>>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe >>>>>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:52,672 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1 >>>>>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe >>>>>>> txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:53,002 - INFO =A0[SessionTracker:ZooKeeperServer@3= 14] - >>>>>>> Expiring session SESSION_0, timeout of 1000ms exceeded >>>>>>> 2011-02-11 14:18:53,002 - INFO >>>>>>> [ProcessThread:-1:PrepRequestProcessor@387] - Processed session >>>>>>> termination for sessionid: SESSION_0 >>>>>>> 2011-02-11 14:18:53,010 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request:: >>>>>>> sessionid:SESSION_0 type:closeSession cxid:0x0 zxid:0x103 txntype:-= 11 >>>>>>> reqpath:n/a >>>>>>> 2011-02-11 14:18:53,010 - INFO =A0[SyncThread:0:NIOServerCnxn@1435]= - >>>>>>> Closed socket connection for client /HOST_0:34618 which had session= id >>>>>>> SESSION_0 >>>>>>> 2011-02-11 14:18:54,502 - INFO =A0[SessionTracker:ZooKeeperServer@3= 14] - >>>>>>> Expiring session SESSION_1, timeout of 1000ms exceeded >>>>>>> 2011-02-11 14:18:54,502 - INFO >>>>>>> [ProcessThread:-1:PrepRequestProcessor@387] - Processed session >>>>>>> termination for sessionid: SESSION_1 >>>>>>> 2011-02-11 14:18:55,011 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request:: >>>>>>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe >>>>>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:55,011 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1 >>>>>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe >>>>>>> txntype:unknown reqpath:n/a >>>>>>> 2011-02-11 14:18:55,011 - INFO >>>>>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn$Factory@25= 1] >>>>>>> - Accepted socket connection from /HOST_1:40556 >>>>>>> 2011-02-11 14:18:55,019 - DEBUG >>>>>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request:: >>>>>>> sessionid:SESSION_1 type:closeSession cxid:0x0 zxid:0x104 txntype:-= 11 >>>>>>> reqpath:n/a >>>>>>> 2011-02-11 14:18:55,019 - DEBUG >>>>>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:ZooKeeperServer@590] - >>>>>>> Dropping request: No session with sessionid SESSION_1 exists, proba= bly >>>>>>> expired and removed >>>>>>> 2011-02-11 14:18:55,019 - INFO =A0[SyncThread:0:NIOServerCnxn@1435]= - >>>>>>> Closed socket connection for client /HOST_1:40555 which had session= id >>>>>>> SESSION_1 >>>>>>> 2011-02-11 14:18:55,020 - DEBUG [SyncThread:0:NIOServerCnxn@1451] - >>>>>>> ignoring exception during output shutdown >>>>>>> java.net.SocketException: Transport endpoint is not connected >>>>>>> =A0 =A0 =A0 =A0at sun.nio.ch.SocketChannelImpl.shutdown(Native Meth= od) >>>>>>> =A0 =A0 =A0 =A0at sun.nio.ch.SocketChannelImpl.shutdownOutput(Socke= tChannelImpl.java:651) >>>>>>> =A0 =A0 =A0 =A0at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAda= ptor.java:368) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn.closeSo= ck(NIOServerCnxn.java:1447) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn.close(N= IOServerCnxn.java:1412) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn$Factory= .closeSessionWithoutWakeup(NIOServerCnxn.java:343) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn$Factory= .closeSession(NIOServerCnxn.java:330) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.FinalRequestProcessor= .processRequest(FinalRequestProcessor.java:133) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.SyncRequestProcessor.= flush(SyncRequestProcessor.java:161) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.SyncRequestProcessor.= run(SyncRequestProcessor.java:98) >>>>>>> 2011-02-11 14:18:55,021 - DEBUG [SyncThread:0:NIOServerCnxn@1459] - >>>>>>> ignoring exception during input shutdown >>>>>>> java.net.SocketException: Transport endpoint is not connected >>>>>>> =A0 =A0 =A0 =A0at sun.nio.ch.SocketChannelImpl.shutdown(Native Meth= od) >>>>>>> =A0 =A0 =A0 =A0at sun.nio.ch.SocketChannelImpl.shutdownInput(Socket= ChannelImpl.java:640) >>>>>>> =A0 =A0 =A0 =A0at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdap= tor.java:360) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn.closeSo= ck(NIOServerCnxn.java:1455) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn.close(N= IOServerCnxn.java:1412) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn$Factory= .closeSessionWithoutWakeup(NIOServerCnxn.java:343) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn$Factory= .closeSession(NIOServerCnxn.java:330) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.FinalRequestProcessor= .processRequest(FinalRequestProcessor.java:133) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.SyncRequestProcessor.= flush(SyncRequestProcessor.java:161) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.SyncRequestProcessor.= run(SyncRequestProcessor.java:98) >>>>>>> 2011-02-11 14:18:55,022 - WARN >>>>>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn$Factory@27= 2] >>>>>>> - Ignoring unexpected runtime exception >>>>>>> java.nio.channels.CancelledKeyException >>>>>>> =A0 =A0 =A0 =A0at sun.nio.ch.SelectionKeyImpl.ensureValid(Selection= KeyImpl.java:55) >>>>>>> =A0 =A0 =A0 =A0at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKey= Impl.java:69) >>>>>>> =A0 =A0 =A0 =A0at org.apache.zookeeper.server.NIOServerCnxn$Factory= .run(NIOServerCnxn.java:241) >>>>>>> 2011-02-11 14:18:55,023 - DEBUG >>>>>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@735] - >>>>>>> Session establishment request from client /HOST_1:40556 client's >>>>>>> lastZxid is 0x0 >>>>>>> 2011-02-11 14:18:55,023 - INFO >>>>>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@770] - Cli= ent >>>>>>> attempting to renew session SESSION_1 at /HOST_1:40556 >>>>>>> 2011-02-11 14:18:55,024 - INFO >>>>>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@1573] - >>>>>>> Invalid session SESSION_1 for client /HOST_1:40556, probably expire= d >>>>>>> 2011-02-11 14:18:55,025 - WARN >>>>>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@634] - >>>>>>> EndOfStreamException: Unable to read additional data from client >>>>>>> sessionid SESSION_1, likely client has closed socket >>>>>>> 2011-02-11 14:18:55,025 - INFO >>>>>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@1435] - >>>>>>> Closed socket connection for client /HOST_1:40556 which had session= id >>>>>>> SESSION_1 >>>>>>> >>>>>>> >>>>>>> On Thu, Feb 10, 2011 at 8:11 PM, Benjamin Reed wrote: >>>>>>>> do you see anything in the server log? >>>>>>>> >>>>>>>> ben >>>>>>>> >>>>>>>> On 02/10/2011 03:16 PM, Patania, Nick wrote: >>>>>>>>> >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> I run the following test: >>>>>>>>> >>>>>>>>> >>>>>>>>> * =A0 =A0 =A0 =A0 Two clients connect to a zookeeper server; the = tickTime on the >>>>>>>>> server is 500, and the sessionTimeout on the client is 1000. >>>>>>>>> >>>>>>>>> * =A0 =A0 =A0 =A0 Kill the host running one of the clients. >>>>>>>>> >>>>>>>>> * =A0 =A0 =A0 =A0 The second client receives a session timeout. >>>>>>>>> >>>>>>>>> Is there any reason why this might happen? >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> Nick Patania >>>>>>>>> >>>>>>>>> >>>>>>>>> -----------------------------------------------------------------= --------- >>>>>>>>> NOTICE: Morgan Stanley is not acting as a municipal advisor and t= he >>>>>>>>> opinions or views contained herein are not intended to be, and do= not >>>>>>>>> constitute, advice within the meaning of Section 975 of the Dodd-= Frank Wall >>>>>>>>> Street Reform and Consumer Protection Act. If you have received t= his >>>>>>>>> communication in error, please destroy all electronic and paper c= opies and >>>>>>>>> notify the sender immediately. Mistransmission is not intended to= waive >>>>>>>>> confidentiality or privilege. Morgan Stanley reserves the right, = to the >>>>>>>>> extent permitted under applicable law, to monitor electronic comm= unications. >>>>>>>>> This message is subject to terms available at the following link: >>>>>>>>> http://www.morganstanley.com/disclaimers. If you cannot access th= ese links, >>>>>>>>> please notify us by reply message and we will send the contents t= o you. By >>>>>>>>> messaging with Morgan Stanley you consent to the foregoing. >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >