Date: Wed, 6 Jun 2012 23:11:22 +0000 (UTC)
From: "Keith Turner (JIRA)"
To: dev@accumulo.apache.org
Reply-To: dev@accumulo.apache.org
Message-ID: <807853523.45650.1339024283251.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Created] (ACCUMULO-623) Data lost with hdfs write ahead log

Keith Turner created ACCUMULO-623:
-------------------------------------

Summary: Data lost with hdfs write ahead log
Key: ACCUMULO-623
URL: https://issues.apache.org/jira/browse/ACCUMULO-623
Project: Accumulo
Issue Type: Bug
Environment: MacOSX, Hadoop 1.0.3, zookeeper 3.3.3
Reporter: Keith Turner
Assignee: Eric Newton
Priority: Blocker
Fix For: 1.5.0

I shut my machine down with Accumulo, Zookeeper, and HDFS running. When I restarted it, Accumulo failed to recover its write ahead log because it was zero length.

I wondered if this was because I shut down HDFS, so I tried the following on my single-node Accumulo instance.

 * start HDFS and zookeeper
 * init & start Accumulo
 * create a table and insert some data
 * pkill -f java
 * restart everything
 * Accumulo fails to start because walog is zero length

Saw exceptions like the following:

{noformat}
06 18:58:44,581 [log.SortedLogRecovery] INFO : Looking at mutations from /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7 for !0;!0<<
06 18:58:44,590 [tabletserver.TabletServer] WARN : exception trying to assign tablet !0;!0<< /root_tablet
java.lang.RuntimeException: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
	at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1458)
	at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1295)
	at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1134)
	at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1121)
	at org.apache.accumulo.server.tabletserver.TabletServer$AssignmentHandler.run(TabletServer.java:2477)
	at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
	at java.lang.Thread.run(Thread.java:680)
Caused by: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
	at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:428)
	at org.apache.accumulo.server.tabletserver.TabletServer.recover(TabletServer.java:3206)
	at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1426)
	... 6 more
Caused by: java.lang.RuntimeException: Unable to read log entries
	at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.findLastStartToFinish(SortedLogRecovery.java:125)
	at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:89)
	at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:426)
	... 8 more
{noformat}

When I run LogReader on the files, it prints nothing.

{noformat}
$ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7
06 19:04:37,147 [util.NativeCodeLoader] WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
$ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/wal/127.0.0.1+40200/def72721-5c64-4755-87cc-2e8cfc3002b7
$
{noformat}
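
For anyone trying to confirm the same symptom, a rough sketch of a zero-length file check follows. It is only an illustration: it walks a local directory with java.nio as a stand-in, whereas the write ahead logs in this report live in HDFS, where Hadoop's own tooling (for example `hadoop fs -ls`) or its FileSystem API would be needed instead. The class name `ZeroLengthFiles` and the method `find` are made up for this example and are not part of Accumulo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Accumulo: lists empty files under a local directory.
final class ZeroLengthFiles {

    // Returns every regular file under root whose size is 0 bytes, sorted by path.
    static List<Path> find(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(ZeroLengthFiles::isEmpty)
                .sorted()
                .collect(Collectors.toList());
        }
    }

    // Wraps the checked exception so the method can be used inside a stream.
    private static boolean isEmpty(Path p) {
        try {
            return Files.size(p) == 0L;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the directory to scan is passed as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : find(root)) {
            System.out.println("zero length: " + p);
        }
    }
}
```

Run against a local copy of the wal and recovery directories, this would list any log file that ended up empty after the kill and restart.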