accumulo-user mailing list archives

From Eric Newton <eric.new...@gmail.com>
Subject Re: ERROR RECOVERY ! METADATA
Date Tue, 25 Nov 2014 17:02:04 GMT
Your write-ahead *recovery* logs are unexpectedly truncated.

Can you provide any context for this failure?  What caused the recovery?
What is your HDFS configuration? Is HDFS full?

During recovery, the master discovers that the server hosting tablets has
died. It then asks some other functioning server to sort the logs for those
tablets, mostly by tablet ID, so that recovery of those tablets on other
servers is efficient.  This sorted file is truncated, which makes me
suspect a full file system.
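
To make the sort step concrete, here is a minimal sketch in plain Java. It is not Accumulo's code: the names SortedLogSketch, Entry, sortLog and MergedReader are made up for illustration, and the entry layout (tablet ID, sequence number, payload) is an assumption. It orders entries by tablet ID and reads the sorted parts back through a heap. Asking the reader for an entry when the sorted data is empty or cut short fails. That is one plausible reading of the PriorityBuffer.remove frame in the quoted stack trace, since Commons Collections throws BufferUnderflowException when removing from an empty buffer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

public class SortedLogSketch {

    /** Hypothetical log entry: owning tablet, order within the log, and data. */
    static final class Entry {
        final String tabletId;
        final long seq;
        final String payload;

        Entry(String tabletId, long seq, String payload) {
            this.tabletId = tabletId;
            this.seq = seq;
            this.payload = payload;
        }
    }

    static final Comparator<Entry> BY_TABLET_THEN_SEQ =
            Comparator.comparing((Entry e) -> e.tabletId).thenComparingLong(e -> e.seq);

    /** The "sort" step: copy a raw log and order it by tablet ID, then sequence. */
    static List<Entry> sortLog(List<Entry> rawLog) {
        List<Entry> sorted = new ArrayList<>(rawLog);
        sorted.sort(BY_TABLET_THEN_SEQ);
        return sorted;
    }

    /** Merges several already-sorted parts using a heap of each part's next entry. */
    static final class MergedReader {
        private static final class Head {
            final Entry entry;
            final Iterator<Entry> rest;

            Head(Entry entry, Iterator<Entry> rest) {
                this.entry = entry;
                this.rest = rest;
            }
        }

        private final PriorityQueue<Head> heap =
                new PriorityQueue<>(Comparator.comparing((Head h) -> h.entry, BY_TABLET_THEN_SEQ));

        MergedReader(List<List<Entry>> sortedParts) {
            for (List<Entry> part : sortedParts) {
                Iterator<Entry> it = part.iterator();
                if (it.hasNext()) {
                    heap.add(new Head(it.next(), it));
                }
            }
        }

        boolean hasNext() {
            return !heap.isEmpty();
        }

        /** Returns the smallest remaining entry; fails if nothing is left to read. */
        Entry next() {
            Head h = heap.poll();
            if (h == null) {
                throw new NoSuchElementException("sorted log has no more entries");
            }
            if (h.rest.hasNext()) {
                heap.add(new Head(h.rest.next(), h.rest));
            }
            return h.entry;
        }
    }

    public static void main(String[] args) {
        List<Entry> raw = Arrays.asList(
                new Entry("tabletB", 1, "b1"),
                new Entry("tabletA", 2, "a2"),
                new Entry("tabletA", 1, "a1"));

        // A healthy sorted log: entries for one tablet come out contiguously.
        MergedReader ok = new MergedReader(Arrays.asList(sortLog(raw)));
        while (ok.hasNext()) {
            Entry e = ok.next();
            System.out.println(e.tabletId + " " + e.seq + " " + e.payload);
        }

        // A truncated (here: empty) sorted log: reading an expected entry fails.
        MergedReader truncated = new MergedReader(Arrays.asList(new ArrayList<Entry>()));
        try {
            truncated.next();
        } catch (NoSuchElementException ex) {
            System.out.println("recovery cannot proceed: " + ex.getMessage());
        }
    }
}
```

Sorting by tablet ID means a server recovering one tablet only reads a contiguous slice instead of scanning every log entry. The cost is that recovery depends on the sorted copy being complete.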

You may be able to recover from your raw write-ahead logs, if:

1) you ensure that HDFS has sufficient space available
2) you have the original write-ahead logs for recovery

-Eric

On Tue, Nov 25, 2014 at 11:25 AM, Josh Elser <josh.elser@gmail.com> wrote:

> Leonardo,
>
> If you lost parts/all of your metadata table, you're going to be in for a
> bit of a rough time. You should be able to get most of your data back;
> however you might see data come back which you had previously deleted and
> data which you recently ingested may be lost.
>
> Check out http://accumulo.apache.org/1.6/accumulo_user_manual.html#zookeeper_failure
>
> The basic idea is to take all of the rfiles for each of your tables,
> recreate the table in a new instance, and then import the files into that
> new table. Even though you're running with Accumulo 1.5, the above user
> manual entry should be applicable.
>
> Let us know how we can help.
>
>
> Leonardo Furio wrote:
>
>> Hi, I got this error. Could you tell me how I can recover
>> my Accumulo table? What if I delete the HDFS file inside the !0 table?
>> Any idea? I don't want to lose my data...
>>
>>
>> reports assignment failed for tablet !0;!0<<
>>
>> 25 16:07:57,0896tserver:lumlv001.gcio.unicredit.eu1
>> WARN
>>
>> rescheduling tablet load in 128.00 seconds
>>
>> 25 16:07:57,0895tserver:lumlv001.gcio.unicredit.eu8
>> WARN
>>
>> failed to open tablet !0;!0<<  reporting failure to master
>>
>> 25 16:07:57,0894tserver:lumlv001.gcio.unicredit.eu8
>> WARN
>>
>> java.io.IOException: org.apache.commons.collections.BufferUnderflowException
>>
>> 25 16:07:57,0892tserver:lumlv001.gcio.unicredit.eu8
>> WARN
>>
>> exception trying to assign tablet !0;!0<<  /root_tablet
>> java.lang.RuntimeException: java.io.IOException: org.apache.commons.collections.BufferUnderflowException
>>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1451)
>>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1300)
>>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1142)
>>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1130)
>>     at org.apache.accumulo.server.tabletserver.TabletServer$AssignmentHandler.run(TabletServer.java:2509)
>>     at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.io.IOException: org.apache.commons.collections.BufferUnderflowException
>>     at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:423)
>>     at org.apache.accumulo.server.tabletserver.TabletServer.recover(TabletServer.java:3419)
>>     at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1421)
>>     ... 6 more
>> Caused by: org.apache.commons.collections.BufferUnderflowException
>>     at org.apache.commons.collections.buffer.PriorityBuffer.get(PriorityBuffer.java:264)
>>     at org.apache.commons.collections.buffer.PriorityBuffer.remove(PriorityBuffer.java:277)
>>     at org.apache.accumulo.server.tabletserver.log.MultiReader.next(MultiReader.java:115)
>>     at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.findLastStartToFinish(SortedLogRecovery.java:145)
>>     at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:103)
>>     at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:421)
>>     ... 8 more
>>
>> 25 16:06:53,0855tserver:lumlv001.gcio.unicredit.eu1
>> WARN
>>
>> rescheduling tablet load in 64.00 seconds
>>
>> 25 16:06:21,0782tserver:lumlv001.gcio.unicredit.eu1
>> WARN
>>
>> rescheduling tablet load in 32.00 seconds
>>
>> 25 16:06:05,0705tserver:lumlv001.gcio.unicredit.eu1
>> WARN
>>
>> rescheduling tablet load in 16.00 seconds
>>
>> 25 16:05:57,0640tserver:lumlv001.gcio.unicredit.eu1
>> WARN
>>
>> rescheduling tablet load in 8.00 seconds
>>
>> 25 16:05:53,0575tserver:lumlv001.gcio.unicredit.eu1
>> WARN
>>
>> rescheduling tablet load in 4.00 seconds
>>
>> 25 16:05:51,0505tserver:lumlv001.gcio.unicredit.eu1
>> WARN
>>
>> rescheduling tablet load in 2.00 seconds
>>
>> 25 16:05:50,0429tserver:lumlv001.gcio.unicredit.eu1
>> WARN
>>
>> rescheduling tablet load in 1.00 seconds
>>
>
