Date: Thu, 3 Nov 2016 14:54:58 +0000 (UTC)
From: "Flavio Junqueira (JIRA)"
To: dev@zookeeper.apache.org
Subject: [jira] [Commented] (ZOOKEEPER-2619) Client library reconnecting breaks FIFO client order

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15632972#comment-15632972 ]

Flavio Junqueira commented on ZOOKEEPER-2619:
---------------------------------------------

I think it would be a lot less disruptive for applications
to fix ZOOKEEPER-22. Why can't we simply jointly work on getting ZOOKEEPER-22 fixed?

> Client library reconnecting breaks FIFO client order
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-2619
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2619
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Diego Ongaro
>
> According to the USENIX ATC 2010 [paper|https://www.usenix.org/conference/usenix-atc-10/zookeeper-wait-free-coordination-internet-scale-systems], ZooKeeper provides "FIFO client order: all requests from a given client are executed in the order that they were sent by the client." I believe applications written using the Java client library are unable to rely on this guarantee, and any current application that does so is broken. Other client libraries are also likely to be affected.
> Consider this application, which is simplified from the algorithm described on Page 4 (right column) of the paper:
> {code}
>   zk = new ZooKeeper(...)
>   zk.createAsync("/data-23857", "...", callback)
>   zk.createSync("/pointer", "/data-23857")
> {code}
> Assume an empty ZooKeeper database to begin with and no other writers. Applying the above definition, if the ZooKeeper database contains /pointer, it must also contain /data-23857.
> Now consider this series of unfortunate events:
> {code}
>   zk = new ZooKeeper(...)
>   // The library establishes a TCP connection.
>   zk.createAsync("/data-23857", "...", callback)
>   // The library/kernel closes the TCP connection because it times out, and
>   // the create of /data-23857 is doomed to fail with ConnectionLoss. Suppose
>   // that it never reaches the server.
>   // The library establishes a new TCP connection.
>   zk.createSync("/pointer", "/data-23857")
>   // The create of /pointer succeeds.
> {code}
> That's the problem: subsequent operations get assigned to the new connection and succeed, while earlier operations fail.
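To make the sequence of events above concrete, here is a minimal sketch in plain Java. It is a toy model under stated assumptions, not the real client: ToyClient, loseConnectionAndReconnect() and replay() are hypothetical names invented for illustration, and the only behaviour modelled is "queued requests fail on connection loss, later requests go out on the new connection".

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model only; none of these types belong to the real ZooKeeper client. */
final class ReconnectReorderDemo {

    /** Hypothetical client that fails queued requests on connection loss, then reconnects. */
    static final class ToyClient {
        final Set<String> serverZnodes = new LinkedHashSet<>(); // what the server has applied
        final List<String> failed = new ArrayList<>();          // requests failed with ConnectionLoss
        private final List<String> queued = new ArrayList<>();  // accepted but not yet transmitted

        /** Accepts a create and returns immediately; nothing is transmitted yet. */
        void createAsync(String path) {
            queued.add(path);
        }

        /** The connection times out: queued requests fail, and a new connection is opened. */
        void loseConnectionAndReconnect() {
            failed.addAll(queued);
            queued.clear();
        }

        /** Transmits everything still queued, then this create, over the current connection. */
        void createSync(String path) {
            serverZnodes.addAll(queued);
            queued.clear();
            serverZnodes.add(path);
        }
    }

    /** Replays the sequence of events from the report and returns the resulting client state. */
    static ToyClient replay() {
        ToyClient zk = new ToyClient();
        zk.createAsync("/data-23857");
        zk.loseConnectionAndReconnect(); // happens before the first create is transmitted
        zk.createSync("/pointer");       // travels over the new connection and succeeds
        return zk;
    }

    public static void main(String[] args) {
        ToyClient zk = replay();
        System.out.println("server znodes: " + zk.serverZnodes);
        System.out.println("failed with ConnectionLoss: " + zk.failed);
    }
}
```

In this model the server ends up holding /pointer but not /data-23857, which is exactly the state that FIFO client order is supposed to rule out.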
> In general, I believe it's impossible to have a system with the following three properties:
>  # FIFO client order for asynchronous operations,
>  # Failing operations when connections are lost, AND
>  # Transparently reconnecting when connections are lost.
> To argue this, consider an application that issues a series of pipelined operations, then upon noticing a connection loss, issues a series of recovery operations, repeating the recovery procedure as necessary. If a pipelined operation fails, all subsequent operations in the pipeline must also fail. Yet the client must also carry on eventually: the recovery operations cannot be trivially failed forever. Unfortunately, the client library does not know where the pipelined operations end and the recovery operations begin. At the time of a connection loss, subsequent pipelined operations may or may not be queued in the library; others might be upcoming in the application thread. If the library re-establishes a connection too early, it will send pipelined operations out of FIFO client order.
> I considered a possible workaround of having the client diligently check its callbacks and watchers for connection loss events, and do its best to stop the subsequent pipelined operations at the first sign of a connection loss. In addition to being a large burden for the application, this does not solve the problem all the time. In particular, if the callback thread is delayed significantly (as can happen due to excessive computation or scheduling hiccups), the application may not learn about the connection loss event until after the connection has been re-established and after dependent pipelined operations have already been transmitted over the new connection.
> I suggest the following API changes to fix the problem:
>  - Add a method ZooKeeper.getConnection() returning a ZKConnection object. ZKConnection would wrap a TCP connection. It would include all synchronous and asynchronous operations currently defined on the ZooKeeper class. Upon a connection loss on a ZKConnection, all subsequent operations on the same ZKConnection would return a Connection Loss error. Upon noticing, the client would need to call ZooKeeper.getConnection() again to get a working ZKConnection object, and it would execute its recovery procedure on this new connection.
>  - Deprecate all asynchronous methods on the ZooKeeper object. These are unsafe to use if the caller assumes they're getting FIFO client order.
>  - No changes to the protocols or servers are required.
> I recognize this could cause a lot of code churn for both ZooKeeper and projects that use it. On the other hand, the existing asynchronous calls in applications should now be audited anyhow.
> The code affected by this issue may be difficult to contain:
>  - It likely affects all ZooKeeper client libraries that provide both asynchronous operations and transparent reconnection. That's probably all versions of the official Java client library, as well as most other client libraries.
>  - It affects all applications using those libraries that depend on the FIFO client order of asynchronous operations. I don't know how common that is, but the paper implies that FIFO client order is important.
>  - Fortunately, the issue can only manifest itself when connections are lost and transparently reestablished. In practice, it may also require a long pipeline or a significant delay in the application thread while the library establishes a new connection.
>  - In case you're wondering, this issue occurred to me while working on a new client library for Go. I haven't seen this issue in the wild, but I was able to produce it locally by placing sleep statements in a Java program and closing its TCP connections.
> I'm new to this community, so I'm looking forward to the discussion. Let me know if I can clarify any of the above.
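The semantics proposed for ZKConnection can be sketched as below. This is only an illustration of the idea under stated assumptions, not an existing API: getConnection() and ZKConnection are the names suggested in the report, while ToyZooKeeper, ToyConnection, ConnectionLossException, simulateLoss() and pipeline() are hypothetical helpers made up for this example. The property shown is that a lost connection fails every later operation issued on it, and that recovery has to run on a freshly obtained connection.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the proposal; ToyZooKeeper and ToyConnection are hypothetical stand-ins. */
final class StickyConnectionDemo {

    /** Hypothetical error raised for any operation on a lost connection. */
    static final class ConnectionLossException extends Exception {
        ConnectionLossException(String path) {
            super("ConnectionLoss: " + path);
        }
    }

    /** Stand-in for the proposed ZKConnection: once lost, it never recovers. */
    static final class ToyConnection {
        private final List<String> serverZnodes;
        private boolean lost;

        ToyConnection(List<String> serverZnodes) {
            this.serverZnodes = serverZnodes;
        }

        void simulateLoss() {
            lost = true;
        }

        void create(String path) throws ConnectionLossException {
            if (lost) {
                throw new ConnectionLossException(path);
            }
            serverZnodes.add(path);
        }
    }

    /** Stand-in for the ZooKeeper handle with the proposed getConnection() method. */
    static final class ToyZooKeeper {
        final List<String> serverZnodes = new ArrayList<>(); // creates in the order applied

        ToyConnection getConnection() {
            return new ToyConnection(serverZnodes);
        }
    }

    /** Runs the pipeline on conn; returns how many operations failed with ConnectionLoss. */
    static int pipeline(ToyConnection conn) {
        int failures = 0;
        for (String path : new String[] {"/data-23857", "/pointer"}) {
            try {
                conn.create(path);
            } catch (ConnectionLossException e) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        ToyZooKeeper zk = new ToyZooKeeper();
        ToyConnection conn = zk.getConnection();
        conn.simulateLoss(); // the connection drops before anything is sent
        System.out.println("failures on lost connection: " + pipeline(conn));
        System.out.println("failures on fresh connection: " + pipeline(zk.getConnection()));
        System.out.println("server znodes: " + zk.serverZnodes);
    }
}
```

Because the lost connection never silently recovers, /pointer cannot be created ahead of /data-23857; the application decides explicitly when to start over on a new connection.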
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)