zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Fournier, Camille F. [Tech]" <Camille.Fourn...@gs.com>
Subject RE: tickTime and sessionTimeout
Date Fri, 11 Feb 2011 23:00:21 GMT
What is your ZooKeeper setup here? And do you continue to see this issue if you increase your
session timeout?

C

-----Original Message-----
From: Nick Patania [mailto:Nicholas.Patania@morganstanley.com] 
Sent: Friday, February 11, 2011 5:24 PM
To: user@zookeeper.apache.org
Subject: Re: tickTime and sessionTimeout

Patrick, thanks for your input.

I have rerun the test several times now while logging GC on the server
and running ping from CLIENT_1:

  - No GC happens on the server during the period of interest (a
couple of young generation runs happen before I kill HOST_0, and they
complete in under 3ms).
  - Round trip times for ping from CLIENT_1 are consistently under
250us throughout.

Regarding client GC -- I can consistently reproduce this using a C
client.  Regarding the theory of swapping on CLIENT_1 -- if that were
the cause, the problem wouldn't be 100% reproducible.  I also looked
through the client log for SESSION_1 -- I see "Got ping response ...
after 1ms" repeatedly, followed by "Client session timed out, have not
heard from server in 666ms"...

On Fri, Feb 11, 2011 at 1:46 PM, Patrick Hunt <phunt@apache.org> wrote:
> Those are pretty short timeouts, many sources of delay could be
> causing this. Network jitter/latency, GC/swap (server or client), IO
> write latency, etc... See if any of this might be your issue:
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/Troubleshooting
>
> I can clearly see this sequence in your log for session1:
>
> ---
> 2011-02-11 14:18:52,672 -  sessionid:SESSION_1 type:ping
>
> 2011-02-11 14:18:54,502 - Expiring session SESSION_1, timeout of 1000ms exceeded
>
> 2011-02-11 14:18:55,011 - Processing request:: sessionid:SESSION_1 type:ping
> ---
>
> from the looks of it session 1 doesn't send a ping to the server for
> ~2.5 seconds, as a result it's expired.
>
> You should also look at your session 1 client log and see what it's
> view of the world is like. (is it gc/swapping?). Use the "stat" 4
> letter word to see the server's request processing latency, see if
> that's high (higer than the timeout is bad news). Again, checkout the
> troubleshooting guide.
>
> Patrick
>
> ps please use pastebin or attachment, otw the formatting of wrecked
> and it's harder to read the log
>
> On Fri, Feb 11, 2011 at 6:41 AM, Nick Patania
> <Nicholas.Patania@morganstanley.com> wrote:
>> This is the portion that seems relevant.  For readability, I replaced
>> the host and session for the host that I kill with HOST_0 and
>> SESSION_0 (I expect these to timeout).  The client that should be
>> healthy is HOST_1 and SESSION_1.
>>
>>
>> 2011-02-11 14:18:51,901 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>> sessionid:SESSION_0 type:ping cxid:0xfffffffffffffffe
>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:51,901 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_0
>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>> txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:52,005 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe
>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:52,005 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1
>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>> txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:52,339 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe
>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:52,339 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1
>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>> txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:52,672 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe
>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:52,672 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1
>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>> txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:53,002 - INFO  [SessionTracker:ZooKeeperServer@314] -
>> Expiring session SESSION_0, timeout of 1000ms exceeded
>> 2011-02-11 14:18:53,002 - INFO
>> [ProcessThread:-1:PrepRequestProcessor@387] - Processed session
>> termination for sessionid: SESSION_0
>> 2011-02-11 14:18:53,010 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>> sessionid:SESSION_0 type:closeSession cxid:0x0 zxid:0x103 txntype:-11
>> reqpath:n/a
>> 2011-02-11 14:18:53,010 - INFO  [SyncThread:0:NIOServerCnxn@1435] -
>> Closed socket connection for client /HOST_0:34618 which had sessionid
>> SESSION_0
>> 2011-02-11 14:18:54,502 - INFO  [SessionTracker:ZooKeeperServer@314] -
>> Expiring session SESSION_1, timeout of 1000ms exceeded
>> 2011-02-11 14:18:54,502 - INFO
>> [ProcessThread:-1:PrepRequestProcessor@387] - Processed session
>> termination for sessionid: SESSION_1
>> 2011-02-11 14:18:55,011 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>> sessionid:SESSION_1 type:ping cxid:0xfffffffffffffffe
>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:55,011 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@160] - sessionid:SESSION_1
>> type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
>> txntype:unknown reqpath:n/a
>> 2011-02-11 14:18:55,011 - INFO
>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn$Factory@251]
>> - Accepted socket connection from /HOST_1:40556
>> 2011-02-11 14:18:55,019 - DEBUG
>> [SyncThread:0:FinalRequestProcessor@78] - Processing request::
>> sessionid:SESSION_1 type:closeSession cxid:0x0 zxid:0x104 txntype:-11
>> reqpath:n/a
>> 2011-02-11 14:18:55,019 - DEBUG
>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:ZooKeeperServer@590] -
>> Dropping request: No session with sessionid SESSION_1 exists, probably
>> expired and removed
>> 2011-02-11 14:18:55,019 - INFO  [SyncThread:0:NIOServerCnxn@1435] -
>> Closed socket connection for client /HOST_1:40555 which had sessionid
>> SESSION_1
>> 2011-02-11 14:18:55,020 - DEBUG [SyncThread:0:NIOServerCnxn@1451] -
>> ignoring exception during output shutdown
>> java.net.SocketException: Transport endpoint is not connected
>>        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>>        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>        at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1447)
>>        at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1412)
>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.closeSessionWithoutWakeup(NIOServerCnxn.java:343)
>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.closeSession(NIOServerCnxn.java:330)
>>        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:133)
>>        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:161)
>>        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
>> 2011-02-11 14:18:55,021 - DEBUG [SyncThread:0:NIOServerCnxn@1459] -
>> ignoring exception during input shutdown
>> java.net.SocketException: Transport endpoint is not connected
>>        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>>        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
>>        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>        at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1455)
>>        at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1412)
>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.closeSessionWithoutWakeup(NIOServerCnxn.java:343)
>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.closeSession(NIOServerCnxn.java:330)
>>        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:133)
>>        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:161)
>>        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:98)
>> 2011-02-11 14:18:55,022 - WARN
>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn$Factory@272]
>> - Ignoring unexpected runtime exception
>> java.nio.channels.CancelledKeyException
>>        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>>        at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:69)
>>        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:241)
>> 2011-02-11 14:18:55,023 - DEBUG
>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@735] -
>> Session establishment request from client /HOST_1:40556 client's
>> lastZxid is 0x0
>> 2011-02-11 14:18:55,023 - INFO
>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@770] - Client
>> attempting to renew session SESSION_1 at /HOST_1:40556
>> 2011-02-11 14:18:55,024 - INFO
>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@1573] -
>> Invalid session SESSION_1 for client /HOST_1:40556, probably expired
>> 2011-02-11 14:18:55,025 - WARN
>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@634] -
>> EndOfStreamException: Unable to read additional data from client
>> sessionid SESSION_1, likely client has closed socket
>> 2011-02-11 14:18:55,025 - INFO
>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:4799:NIOServerCnxn@1435] -
>> Closed socket connection for client /HOST_1:40556 which had sessionid
>> SESSION_1
>>
>>
>> On Thu, Feb 10, 2011 at 8:11 PM, Benjamin Reed <breed@yahoo-inc.com> wrote:
>>> do you see anything in the server log?
>>>
>>> ben
>>>
>>> On 02/10/2011 03:16 PM, Patania, Nick wrote:
>>>>
>>>> Hello,
>>>>
>>>> I run the following test:
>>>>
>>>>
>>>> *         Two clients connect to a zookeeper server; the tickTime on
the
>>>> server is 500, and the sessionTimeout on the client is 1000.
>>>>
>>>> *         Kill the host running one of the clients.
>>>>
>>>> *         The second client receives a session timeout.
>>>>
>>>> Is there any reason why this might happen?
>>>>
>>>> Thanks
>>>> Nick Patania
>>>>
>>>>
>>>> --------------------------------------------------------------------------
>>>> NOTICE: Morgan Stanley is not acting as a municipal advisor and the
>>>> opinions or views contained herein are not intended to be, and do not
>>>> constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall
>>>> Street Reform and Consumer Protection Act. If you have received this
>>>> communication in error, please destroy all electronic and paper copies and
>>>> notify the sender immediately. Mistransmission is not intended to waive
>>>> confidentiality or privilege. Morgan Stanley reserves the right, to the
>>>> extent permitted under applicable law, to monitor electronic communications.
>>>> This message is subject to terms available at the following link:
>>>> http://www.morganstanley.com/disclaimers. If you cannot access these links,
>>>> please notify us by reply message and we will send the contents to you. By
>>>> messaging with Morgan Stanley you consent to the foregoing.
>>>
>>>
>>
>

Mime
View raw message