From: Aishwarya Ganesan
Date: Fri, 6 Jan 2017 14:38:57 -0600
Subject: Crash on detecting a corruption
To: dev@zookeeper.apache.org

Hi,

We are looking at how ZooKeeper handles silent data corruptions resulting from underlying problems in disks and the file systems atop them [1,2]. We set up a 3-node ZooKeeper cluster and introduced silent data corruptions into different blocks of the on-disk files. In all cases, ZooKeeper was able to detect corruptions in the log file using checksums. However, on detecting a corruption, the ZooKeeper node on which the corruption occurred crashes instead of trying to fix the corrupted data automatically using the replicas.

Why does ZooKeeper not fix the corrupted entry automatically using the replicas? What is the reason for this design decision? It would be helpful if anyone could give some insight into this.

[1] https://research.cs.wisc.edu/wind/Publications/zfs-corruption-fast10.pdf
[2] http://www.cs.toronto.edu/~bianca/papers/fast08.pdf

Thanks,
Aishwarya