Date: Wed, 4 Dec 2013 11:29:42 -0800
Subject: Recurring "Failed to process transaction type: 1 error: KeeperErrorCode = NoNode for..."
From: "Bae, Jae Hyeon"
To: user@zookeeper.apache.org

Hi ZooKeeper users,

When the leader ZooKeeper instance is killed, all followers should, in theory, restart and elect a new leader gracefully. But fairly often I observe that none of the followers can start, failing with:

"Failed to process transaction type: 1 error: KeeperErrorCode = NoNode for..."
"Unable to load database on disk"

So I had to manually copy the file from the observer to each instance and restart them.

What would be the root cause? Even if a few clients are sending heavy write traffic, including delete, create, and setData requests, ZooKeeper shouldn't corrupt its data, correct?

Is this a disk I/O performance or write cache problem? We're running ZooKeeper on AWS EC2, so we are using the same disk for the snapshot directory and the log directory, and we have not turned off the disk write cache.

Thank you

Best,
Jae