Return-Path: X-Original-To: apmail-zookeeper-user-archive@www.apache.org Delivered-To: apmail-zookeeper-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 44CDE10DD4 for ; Fri, 1 Nov 2013 06:18:30 +0000 (UTC) Received: (qmail 2840 invoked by uid 500); 1 Nov 2013 06:18:27 -0000 Delivered-To: apmail-zookeeper-user-archive@zookeeper.apache.org Received: (qmail 2787 invoked by uid 500); 1 Nov 2013 06:18:21 -0000 Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@zookeeper.apache.org Delivered-To: mailing list user@zookeeper.apache.org Received: (qmail 2779 invoked by uid 500); 1 Nov 2013 06:18:18 -0000 Delivered-To: apmail-hadoop-zookeeper-user@hadoop.apache.org Received: (qmail 2776 invoked by uid 99); 1 Nov 2013 06:18:18 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 01 Nov 2013 06:18:18 +0000 X-ASF-Spam-Status: No, hits=2.3 required=5.0 tests=SPF_SOFTFAIL,URI_HEX X-Spam-Check-By: apache.org Received-SPF: softfail (nike.apache.org: transitioning domain of rssrik@gmail.com does not designate 216.139.236.26 as permitted sender) Received: from [216.139.236.26] (HELO sam.nabble.com) (216.139.236.26) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 01 Nov 2013 06:18:12 +0000 Received: from jim.nabble.com ([192.168.236.80]) by sam.nabble.com with esmtp (Exim 4.72) (envelope-from ) id 1Vc83K-0003bg-48 for zookeeper-user@hadoop.apache.org; Thu, 31 Oct 2013 23:17:50 -0700 Date: Thu, 31 Oct 2013 23:17:50 -0700 (PDT) From: Srikanth To: zookeeper-user@hadoop.apache.org Message-ID: <1383286670115-7579223.post@n2.nabble.com> In-Reply-To: References: <1382690403604-7579192.post@n2.nabble.com> Subject: Re: Zookeeper session timeouts during RAID Checks MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org Enabled TRACE Logs and set fsync-warningthreshold to 50ms. 1. here's the trace when the pings from client are successful. As soon as the request comes in, commit processor processes it immediately. 2013-10-31 19:07:24,073 [myid:1] - TRACE [FollowerRequestProcessor:1:ZooTrace@90] - :Fsessionid:0x2420f7ae4ba0000 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffff ffe txntype:unknown reqpath:n/a 2013-10-31 19:07:24,073 [myid:1] - DEBUG [FollowerRequestProcessor:1:CommitProcessor@171] - Processing request:: sessionid:0x2420f7ae4ba0000 type:ping cxid:0xffffffffff fffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2013-10-31 19:07:24,073 [myid:1] - DEBUG [CommitProcessor:1:FinalRequestProcessor@88] - Processing request:: sessionid:0x2420f7ae4ba0000 type:ping cxid:0xffffffffffffff fe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2013-10-31 19:07:24,073 [myid:1] - DEBUG [CommitProcessor:1:FinalRequestProcessor@160] - sessionid:0x2420f7ae4ba0000 type:ping cxid:0xfffffffffffffffe zxid:0xffffffffff fffffe txntype:unknown reqpath:n/a 2. This is the trace when the client timesout. There are no fsync warnings during this period. There is a 2 sec interval between the first trace (FollowerRequestProcessor:1:ZooTrace@90) and when commit processor sees the request (FollowerRequestProcessor:1:CommitProcessor@171]). Almost everytime when zkclient timesout, the observation is similar. 2013-10-31 19:07:25,739 [myid:1] - TRACE [FollowerRequestProcessor:1:ZooTrace@90] - :Fsessionid:0x2420f7ae4ba0000 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffff ffe txntype:unknown reqpath:n/a 2013-10-31 19:07:27,518 [myid:1] - DEBUG [FollowerRequestProcessor:1:CommitProcessor@171] - Processing request:: sessionid:0x2420f7ae4ba0000 type:ping cxid:0xffffffffff fffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2013-10-31 19:07:27,518 [myid:1] - DEBUG [CommitProcessor:1:FinalRequestProcessor@88] - Processing request:: sessionid:0x2420f7ae4ba0000 type:ping cxid:0xffffffffffffff fe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2013-10-31 19:07:27,519 [myid:1] - DEBUG [CommitProcessor:1:FinalRequestProcessor@160] - sessionid:0x2420f7ae4ba0000 type:ping cxid:0xfffffffffffffffe zxid:0xffffffffff fffffe txntype:unknown reqpath:n/a 2013-10-31 19:07:27,498 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x2420f7ae4ba0000, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:722) 2013-10-31 19:07:27,519 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /xx.yy.zz.aa:56844 whic h had sessionid 0x2420f7ae4ba0000 Any clues what would cause this 2 sec delay in processing ? thanks, Srikanth -- View this message in context: http://zookeeper-user.578899.n2.nabble.com/Zookeeper-session-timeouts-during-RAID-Checks-tp7579160p7579223.html Sent from the zookeeper-user mailing list archive at Nabble.com.