accumulo-dev mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-623) Data lost with hdfs write ahead log
Date Mon, 16 Jul 2012 17:40:33 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415434#comment-13415434 ]

Keith Turner commented on ACCUMULO-623:
---------------------------------------

John V and I were discussing this.  One possibility is that a tserver will only start if
dfs.durable.sync OR dfs.support.append is set to true.  This is kind of screwy because at some
point the property dfs.support.append (which defaults to false) will go away and the property
dfs.durable.sync (which defaults to true) will appear.  However, I do not think there is a way
to determine what a property defaults to in Hadoop, because the default is just hardcoded into
the code that uses the property.  So a user would need to explicitly set dfs.durable.sync to
true in their config even though this is the default.  See HADOOP-8365.
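
A rough sketch of the check described above, only to illustrate the idea.  The class name
SyncCheck and the method walSyncEnabled are hypothetical and not part of Accumulo or Hadoop,
and a plain Map stands in for whatever properties the user has explicitly set in their config.
An unset property is treated as false, since the real default cannot be discovered.

```java
import java.util.Map;

public class SyncCheck {

    // Hypothetical helper: returns true only if at least one of the two
    // properties has been explicitly set to "true" in the supplied settings.
    // A missing property counts as false because its hardcoded default
    // is not visible from the configuration alone.
    static boolean walSyncEnabled(Map<String, String> explicitProps) {
        return Boolean.parseBoolean(explicitProps.get("dfs.durable.sync"))
            || Boolean.parseBoolean(explicitProps.get("dfs.support.append"));
    }

    public static void main(String[] args) {
        // Nothing set explicitly: the tserver would refuse to start.
        System.out.println(walSyncEnabled(Map.of()));
        // Either property explicitly set to true is enough.
        System.out.println(walSyncEnabled(Map.of("dfs.durable.sync", "true")));
        System.out.println(walSyncEnabled(Map.of("dfs.support.append", "true")));
    }
}
```

The downside noted above shows up in the first call: with nothing set explicitly, the check
fails even on a Hadoop version where dfs.durable.sync would default to true.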
                
> Data lost with hdfs write ahead log
> -----------------------------------
>
>                 Key: ACCUMULO-623
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-623
>             Project: Accumulo
>          Issue Type: Bug
>         Environment: MacOSX, Hadoop 1.0.3, zookeeper 3.3.3
>            Reporter: Keith Turner
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> I shut my machine down with Accumulo, Zookeeper, and HDFS running.  When I restarted
> it, Accumulo failed to recover its write ahead log because it was zero length.  I wondered
> if this was because I shut down HDFS, so I tried the following on my single node Accumulo instance.
>  * start HDFS and zookeeper
>  * init & start Accumulo
>  * create a table and insert some data
>  * pkill -f java
>  * restart everything
>  * Accumulo fails to start because walog is zero length
> Saw exceptions like the following
> {noformat}
> 06 18:58:44,581 [log.SortedLogRecovery] INFO : Looking at mutations from /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7 for !0;!0<<
> 06 18:58:44,590 [tabletserver.TabletServer] WARN : exception trying to assign tablet !0;!0<< /root_tablet
> java.lang.RuntimeException: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1458)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1295)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1134)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1121)
>         at org.apache.accumulo.server.tabletserver.TabletServer$AssignmentHandler.run(TabletServer.java:2477)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:680)
> Caused by: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:428)
>         at org.apache.accumulo.server.tabletserver.TabletServer.recover(TabletServer.java:3206)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1426)
>         ... 6 more
> Caused by: java.lang.RuntimeException: Unable to read log entries
>         at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.findLastStartToFinish(SortedLogRecovery.java:125)
>         at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:89)
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:426)
>         ... 8 more
> {noformat}
> When trying to run LogReader on the files, it prints nothing.  
> {noformat}
> $ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7
> 06 19:04:37,147 [util.NativeCodeLoader] WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> $ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/wal/127.0.0.1+40200/def72721-5c64-4755-87cc-2e8cfc3002b7
> $ 
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
