zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Fournier, Camille F. [Tech]" <Camille.Fourn...@gs.com>
Subject RE: tickTime and sessionTimeout
Date Fri, 11 Feb 2011 23:15:03 GMT
Which version of ZK?

-----Original Message-----
From: Nick Patania [mailto:Nicholas.Patania@morganstanley.com] 
Sent: Friday, February 11, 2011 6:10 PM
To: user@zookeeper.apache.org
Subject: Re: tickTime and sessionTimeout

A single server.  I've intentionally made it trivial to demonstrate
the behavior.
If I increase the timeout, the issue goes away.

On Fri, Feb 11, 2011 at 6:00 PM, Fournier, Camille F. [Tech]
<Camille.Fournier@gs.com> wrote:
> What is your ZooKeeper setup here? And do you continue to see this issue if you increase
your session timeout?
>
> C
>
> -----Original Message-----
> From: Nick Patania [mailto:Nicholas.Patania@morganstanley.com]
> Sent: Friday, February 11, 2011 5:24 PM
> To: user@zookeeper.apache.org
> Subject: Re: tickTime and sessionTimeout
>
> Patrick, thanks for your input.
>
> I have rerun the test several times now while logging GC on the server
> and running ping from CLIENT_1:
>
>  - No GC happens on the server during the period of interest (a
> couple of young generation runs happen before I kill HOST_0, and they
> complete in under 3ms).
>  - Round trip times for ping from CLIENT_1 are consistently under
> 250us throughout.
>
> Regarding client GC -- I can consistently reproduce this using a C
> client.  Regarding the theory of swapping on CLIENT_1 -- if that were
> the cause, the problem wouldn't be 100% reproducible.  I also looked
> through the client log for SESSION_1 -- I see "Got ping response ...
> after 1ms" repeatedly, followed by "Client session timed out, have not
> heard from server in 666ms"...
>
> On Fri, Feb 11, 2011 at 1:46 PM, Patrick Hunt <phunt@apache.org> wrote:
>> Those are pretty short timeouts, many sources of delay could be
>> causing this. Network jitter/latency, GC/swap (server or client), IO
>> write latency, etc... See if any of this might be your issue:
>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/Troubleshooting
>>
>> I can clearly see this sequence in your log for session1:
>>
>> ---
>> 2011-02-11 14:18:52,672 -  sessionid:SESSION_1 type:ping
>>
>> 2011-02-11 14:18:54,502 - Expiring session SESSION_1, timeout of 1000ms exceeded
>>
>> 2011-02-11 14:18:55,011 - Processing request:: sessionid:SESSION_1 type:ping
>> ---
>>
>> from the looks of it session 1 doesn't send a ping to the server for
>> ~2.5 seconds, as a result it's expired.
>>
>> You should also look at your session 1 client log and see what it's
>> view of the world is like. (is it gc/swapping?). Use the "stat" 4
>> letter word to see the server's request processing latency, see if
>> that's high (higer than the timeout is bad news). Again, checkout the
>> troubleshooting guide.
>>
>> Patrick
>>
>> ps please use pastebin or attachment, otw the formatting of wrecked
>> and it's harder to read the log
>>
>> On Fri, Feb 11, 2011 at 6:41 AM, Nick Patania
>> <Nicholas.Patania@morganstanley.com> wrote:
>>> This is the portion that seems relevant.  For readability, I replaced
>>> the host and session for the host that I kill with HOST_0 and
>>> SESSION_0 (I expect these to timeout).  The client that should be
>>> healthy is HOST_1 and SESSION_1.
>>>
>>>
>>> 2011-02-11 14:18:51,901 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>>> sessionid:SESSION_0 type:ping cxid:0xfffffffffffffffe
>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:51,901 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_0
>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>>> txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:52,005 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe
>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:52,005 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1
>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>>> txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:52,339 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe
>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:52,339 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1
>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>>> txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:52,672 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe
>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:52,672 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1
>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>>> txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:53,002 - INFO  [SessionTracker:ZooKeeperServer@314] -
>>> Expiring session SESSION_0, timeout of 1000ms exceeded
>>> 2011-02-11 14:18:53,002 - INFO
>>> [ProcessThread:-1:PrepRequestProcessor@387] - Processed session
>>> termination for sessionid: SESSION_0
>>> 2011-02-11 14:18:53,010 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>>> sessionid:SESSION_0 type:closeSession cxid:0x0 zxid:0x103 txntype:-11
>>> reqpath:n/a
>>> 2011-02-11 14:18:53,010 - INFO  [SyncThread:0:NIOServerCnxn@1435] -
>>> Closed socket connection for client /HOST_0:34618 which had sessionid
>>> SESSION_0
>>> 2011-02-11 14:18:54,502 - INFO  [SessionTracker:ZooKeeperServer@314] -
>>> Expiring session SESSION_1, timeout of 1000ms exceeded
>>> 2011-02-11 14:18:54,502 - INFO
>>> [ProcessThread:-1:PrepRequestProcessor@387] - Processed session
>>> termination for sessionid: SESSION_1
>>> 2011-02-11 14:18:55,011 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe
>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:55,011 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1
>>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>>> txntype:unknown reqpath:n/a
>>> 2011-02-11 14:18:55,011 - INFO
>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn$Factory@251]
>>> - Accepted socket connection from /HOST_1:40556
>>> 2011-02-11 14:18:55,019 - DEBUG
>>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>>> sessionid:SESSION_1 type:closeSession cxid:0x0 zxid:0x104 txntype:-11
>>> reqpath:n/a
>>> 2011-02-11 14:18:55,019 - DEBUG
>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:ZooKeeperServer@590] -
>>> Dropping request: No session with sessionid SESSION_1 exists, probably
>>> expired and removed
>>> 2011-02-11 14:18:55,019 - INFO  [SyncThread:0:NIOServerCnxn@1435] -
>>> Closed socket connection for client /HOST_1:40555 which had sessionid
>>> SESSION_1
>>> 2011-02-11 14:18:55,020 - DEBUG [SyncThread:0:NIOServerCnxn@1451] -
>>> ignoring exception during output shutdown
>>> java.net.SocketException: Transport endpoint is not connected
>>>        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>>>        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>>>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>        at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1447)
>>>        at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1412)
>>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.closeSessionWithoutWakeup(NIOServerCnxn.java:343)
>>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.closeSession(NIOServerCnxn.java:330)
>>>        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:133)
>>>        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:161)
>>>        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
>>> 2011-02-11 14:18:55,021 - DEBUG [SyncThread:0:NIOServerCnxn@1459] -
>>> ignoring exception during input shutdown
>>> java.net.SocketException: Transport endpoint is not connected
>>>        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>>>        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
>>>        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>        at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1455)
>>>        at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1412)
>>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.closeSessionWithoutWakeup(NIOServerCnxn.java:343)
>>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.closeSession(NIOServerCnxn.java:330)
>>>        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:133)
>>>        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:161)
>>>        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
>>> 2011-02-11 14:18:55,022 - WARN
>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn$Factory@272]
>>> - Ignoring unexpected runtime exception
>>> java.nio.channels.CancelledKeyException
>>>        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>>>        at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:69)
>>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:241)
>>> 2011-02-11 14:18:55,023 - DEBUG
>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@735] -
>>> Session establishment request from client /HOST_1:40556 client's
>>> lastZxid is 0x0
>>> 2011-02-11 14:18:55,023 - INFO
>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@770] - Client
>>> attempting to renew session SESSION_1 at /HOST_1:40556
>>> 2011-02-11 14:18:55,024 - INFO
>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@1573] -
>>> Invalid session SESSION_1 for client /HOST_1:40556, probably expired
>>> 2011-02-11 14:18:55,025 - WARN
>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@634] -
>>> EndOfStreamException: Unable to read additional data from client
>>> sessionid SESSION_1, likely client has closed socket
>>> 2011-02-11 14:18:55,025 - INFO
>>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@1435] -
>>> Closed socket connection for client /HOST_1:40556 which had sessionid
>>> SESSION_1
>>>
>>>
>>> On Thu, Feb 10, 2011 at 8:11 PM, Benjamin Reed <breed@yahoo-inc.com> wrote:
>>>> do you see anything in the server log?
>>>>
>>>> ben
>>>>
>>>> On 02/10/2011 03:16 PM, Patania, Nick wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I run the following test:
>>>>>
>>>>>
>>>>> *         Two clients connect to a zookeeper server; the tickTime
on the
>>>>> server is 500, and the sessionTimeout on the client is 1000.
>>>>>
>>>>> *         Kill the host running one of the clients.
>>>>>
>>>>> *         The second client receives a session timeout.
>>>>>
>>>>> Is there any reason why this might happen?
>>>>>
>>>>> Thanks
>>>>> Nick Patania
>>>>>
>>>>>
>>>>> --------------------------------------------------------------------------
>>>>> NOTICE: Morgan Stanley is not acting as a municipal advisor and the
>>>>> opinions or views contained herein are not intended to be, and do not
>>>>> constitute, advice within the meaning of Section 975 of the Dodd-Frank
Wall
>>>>> Street Reform and Consumer Protection Act. If you have received this
>>>>> communication in error, please destroy all electronic and paper copies
and
>>>>> notify the sender immediately. Mistransmission is not intended to waive
>>>>> confidentiality or privilege. Morgan Stanley reserves the right, to the
>>>>> extent permitted under applicable law, to monitor electronic communications.
>>>>> This message is subject to terms available at the following link:
>>>>> http://www.morganstanley.com/disclaimers. If you cannot access these
links,
>>>>> please notify us by reply message and we will send the contents to you.
By
>>>>> messaging with Morgan Stanley you consent to the foregoing.
>>>>
>>>>
>>>
>>
>

Mime
View raw message