Date: Mon, 16 Jul 2012 17:40:33 +0000 (UTC)
From: "Keith Turner (JIRA)"
To: dev@accumulo.apache.org
Subject: [jira] [Commented] (ACCUMULO-623) Data lost with hdfs write ahead log

    [ https://issues.apache.org/jira/browse/ACCUMULO-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415434#comment-13415434 ]

Keith Turner commented on ACCUMULO-623:
---------------------------------------

John V and I were discussing this. One possibility is that a tserver will only start if dfs.durable.sync OR dfs.support.append is set to true.
This is kinda screwy because at some point the property dfs.support.append (which defaults to false) will go away and the property dfs.durable.sync (which defaults to true) will appear. However, I do not think there is a way to determine what a property defaults to in Hadoop, because the default is just hardcoded into the code that uses the property. So a user would need to explicitly set dfs.durable.sync to true in their config even though this is the default. See HADOOP-8365.

> Data lost with hdfs write ahead log
> -----------------------------------
>
>                 Key: ACCUMULO-623
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-623
>             Project: Accumulo
>          Issue Type: Bug
>         Environment: MacOSX, Hadoop 1.0.3, zookeeper 3.3.3
>            Reporter: Keith Turner
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> I shut my machine down with Accumulo, Zookeeper, and HDFS running. When I restarted it, Accumulo failed to recover its write ahead log because it was zero length. I wondered if this was because I shut down HDFS, so I tried the following on my single node Accumulo instance.
> * start HDFS and zookeeper
> * init & start Accumulo
> * create a table and insert some data
> * pkill -f java
> * restart everything
> * Accumulo fails to start because walog is zero length
> Saw exceptions like the following
> {noformat}
> 06 18:58:44,581 [log.SortedLogRecovery] INFO : Looking at mutations from /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7 for !0;!0<<
> 06 18:58:44,590 [tabletserver.TabletServer] WARN : exception trying to assign tablet !0;!0<< /root_tablet
> java.lang.RuntimeException: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1458)
>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1295)
>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1134)
>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1121)
>     at org.apache.accumulo.server.tabletserver.TabletServer$AssignmentHandler.run(TabletServer.java:2477)
>     at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>     at java.lang.Thread.run(Thread.java:680)
> Caused by: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
>     at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:428)
>     at org.apache.accumulo.server.tabletserver.TabletServer.recover(TabletServer.java:3206)
>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1426)
>     ... 6 more
> Caused by: java.lang.RuntimeException: Unable to read log entries
>     at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.findLastStartToFinish(SortedLogRecovery.java:125)
>     at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:89)
>     at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:426)
>     ... 8 more
> {noformat}
> When trying to run LogReader on the files, it prints nothing.
> {noformat}
> $ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7
> 06 19:04:37,147 [util.NativeCodeLoader] WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> $ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/wal/127.0.0.1+40200/def72721-5c64-4755-87cc-2e8cfc3002b7
> $
> {noformat}
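
For illustration, the startup check proposed in the comment at the top of this message (a tserver starts only if dfs.durable.sync OR dfs.support.append is set to true) could be sketched roughly as below. This is not Accumulo or Hadoop code. A plain Map stands in for the Hadoop configuration, and the names DurableSyncCheck, isExplicitlyTrue, and canStart are hypothetical. The sketch assumes that a property nobody set is simply absent, so the check cannot tell "unset, defaults to true" from "unset, defaults to false". That is why the user would have to set the value explicitly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a Map stands in for the Hadoop configuration.
// Only properties the user set explicitly appear in the map; defaults
// hardcoded elsewhere are invisible to this check.
class DurableSyncCheck {

    // True only when the property is present and its value is "true".
    static boolean isExplicitlyTrue(Map<String, String> conf, String key) {
        String value = conf.get(key);
        return value != null && value.trim().equalsIgnoreCase("true");
    }

    // The proposed rule: allow startup if either property is explicitly true.
    static boolean canStart(Map<String, String> conf) {
        return isExplicitlyTrue(conf, "dfs.durable.sync")
            || isExplicitlyTrue(conf, "dfs.support.append");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // Nothing set: the check refuses to start, even if the real
        // default of dfs.durable.sync would have been true.
        System.out.println("unset: " + canStart(conf));

        // Explicitly setting the property satisfies the check.
        conf.put("dfs.durable.sync", "true");
        System.out.println("dfs.durable.sync=true: " + canStart(conf));
    }
}
```

The trade-off is the one described in the comment: the check is simple and works across both property names, but it makes users restate a value that is already the default.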