From: Andrew Purtell
Date: Wed, 1 Mar 2017 21:37:10 -0800
Subject: Partial crash bug described in Redundancy Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions to Single Errors and Corruptions (FAST17)
To: dev@zookeeper.apache.org

Is there a JIRA open for the partial crash bug described in "Redundancy
Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions
to Single Errors and Corruptions" Aishwarya Ganesan, Ramnatthan Alagappan,
Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau, University of
Wisconsin-Madison. 15th USENIX Conference on File and Storage Technologies
(FAST ’17)?

From
https://www.usenix.org/system/files/conference/fast17/fast17-ganesan.pdf

"Unfortunately, ZooKeeper does not recover from write errors to the
transaction head and log tail. On write errors during log initialization,
the error handling code tries to gracefully shutdown the node but kills
only the transaction processing threads; the quorum thread remains alive
(partial crash).
Consequently, other nodes believe that the leader is healthy and do not
elect a new leader. However, since the leader has partially crashed, it
cannot propose any transactions, leading to an indefinite write
unavailability."

-- 
Best regards,
Andrew Purtell
apurtell@salesforce.com
apurtell@apache.org