Date: Tue, 16 May 2017 22:23:04 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: dev@zookeeper.apache.org
Subject: [jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013200#comment-16013200 ]

ASF GitHub Bot commented on ZOOKEEPER-2775:
-------------------------------------------

Github user arshadmohammad commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/254#discussion_r116872750

    --- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java ---
    @@ -1080,6 +1080,8 @@ private void startConnect() throws IOException {
                         zooKeeperSaslClient.shutdown();
                     }
                     zooKeeperSaslClient = new ZooKeeperSaslClient(getServerPrincipal(addr), clientConfig);
    +                // SASL login succeeded
    +                saslLoginFailed = false;
    --- End diff --

    This change has an impact on tunnelAuthInProgress. But yes, we should initialize the variable when a new connection starts, as this is what the tunnelAuthInProgress logic expects.

> ZK Client not able to connect with Xid out of order error
> ----------------------------------------------------------
>
>                 Key: ZOOKEEPER-2775
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.4.10, 3.5.3, 3.6.0
>            Reporter: Bhupendra Kumar Jain
>            Assignee: Mohammad Arshad
>            Priority: Critical
>         Attachments: ZOOKEEPER-2775-01.patch
>
>
> During a network-unreachable scenario in one of our clusters, we observed "Xid out of order" and "Nothing in the queue" errors continuously, and the ZK client was ultimately unable to connect successfully to the ZK server.
> *Logs:*
> unexpected error, closing socket connection and attempting reconnect | org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447)
> java.io.IOException: Xid out of order. Got Xid 52 with err 0 expected Xid 53 for a packet with details: clientPath:null serverPath:null finished:false header:: 53,101 replyHeader:: 0,0,-4 request:: 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes} response:: null
>     at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
> unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Nothing in the queue, but got 1
>     at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
>
> *Analysis:*
> 1) The first time, the client fails to do the SASL login because the network is unreachable.
> 2017-03-29 10:03:59,377 | WARN | [main-SendThread(192.168.130.8:24002)] | SASL configuration failed: javax.security.auth.login.LoginException: Network is unreachable (sendto failed) Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it. | org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1307)
> Here the boolean saslLoginFailed becomes true.
> 2) After some time the network connection recovers and the client logs in successfully, but the boolean saslLoginFailed is still not reset to false.
> 3) SASL negotiation between client and server now begins, and during this time no user requests are sent (the socket channel stays closed for writes until SASL negotiation completes).
> 4) The client then processes the server's response to the SASL packet. It assumes that authentication is finished, because tunnelAuthInProgress() checks the saslLoginFailed boolean and, since it is still true, reports that authentication is no longer in progress. The client therefore tries to process the SASL response as an ordinary packet, which results in the errors above.
> *Solution:* Reset the saslLoginFailed boolean every time before client login
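
The failure sequence described in the analysis can be modelled in a few lines. What follows is a minimal Java sketch under stated assumptions, not the actual ClientCnxn code: the class SaslFlagSketch, the Connection type, the negotiationInProgress field, and the resetOnConnect switch are hypothetical names introduced for illustration. Only saslLoginFailed, startConnect, and tunnelAuthInProgress are taken from the discussion, and their bodies here are simplified stand-ins.

```java
public class SaslFlagSketch {

    /** Hypothetical stand-in for the client connection state; not the real ClientCnxn. */
    static class Connection {
        boolean saslLoginFailed = false;
        // Assumption: a single flag models "SASL negotiation has started but not finished".
        boolean negotiationInProgress = false;
        // Hypothetical switch: true applies the proposed fix, false keeps the old behavior.
        final boolean resetOnConnect;

        Connection(boolean resetOnConnect) {
            this.resetOnConnect = resetOnConnect;
        }

        /** Simplified model of a connection attempt that includes a SASL login. */
        void startConnect(boolean loginSucceeds) {
            if (loginSucceeds) {
                if (resetOnConnect) {
                    // Proposed fix: clear the stale failure flag after a successful login.
                    saslLoginFailed = false;
                }
                negotiationInProgress = true;
            } else {
                // Step 1 of the analysis: the login fails and the flag is set.
                saslLoginFailed = true;
                negotiationInProgress = false;
            }
        }

        /** Simplified model of the check described in step 4 of the analysis. */
        boolean tunnelAuthInProgress() {
            if (saslLoginFailed) {
                // A failed login is treated as "no authentication in progress".
                return false;
            }
            return negotiationInProgress;
        }
    }

    public static void main(String[] args) {
        // Without the reset: the first login fails, the second succeeds, the flag stays stale.
        Connection withoutReset = new Connection(false);
        withoutReset.startConnect(false);
        withoutReset.startConnect(true);
        System.out.println("without reset: " + withoutReset.tunnelAuthInProgress());

        // With the reset: the same sequence, but the flag is cleared on the successful login.
        Connection withReset = new Connection(true);
        withReset.startConnect(false);
        withReset.startConnect(true);
        System.out.println("with reset: " + withReset.tunnelAuthInProgress());
    }
}
```

In this model the connection without the reset reports that no authentication is in progress even though negotiation has started, which corresponds to the state in which the SASL response would be handled as an ordinary packet. With the reset, the same sequence reports authentication as in progress.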