accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-716) Corrupt WAL file
Date Wed, 09 Jan 2013 18:56:12 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548822#comment-13548822 ]

Eric Newton commented on ACCUMULO-716:
--------------------------------------

I was able to reproduce this.

 * start Accumulo
 * use TestIngest to write some data
 * kill everything
 * find the last block of the last WAL file in the NameNode logs
 * find that block's file on disk, and delete the last bunch of bytes
 * start Accumulo
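
The "delete the last bunch of bytes" step can be sketched roughly as below. This is only an illustration, not the exact procedure used here: `truncate_tail` is a hypothetical helper, the path would be wherever the block file lives on the DataNode's local disk, and the number of bytes to remove is not specified in this comment.

```python
import os


def truncate_tail(path: str, nbytes: int) -> int:
    """Remove the last `nbytes` bytes of the file at `path`.

    Hypothetical helper for illustration only. Returns the new file size.
    If `nbytes` exceeds the file size, the file is truncated to zero bytes.
    """
    size = os.path.getsize(path)
    new_size = max(0, size - nbytes)
    # Open read/write in binary mode so the existing contents are preserved
    # up to the truncation point.
    with open(path, "r+b") as f:
        f.truncate(new_size)
    return new_size
```

Something like `truncate_tail("/path/to/blk_...", 100)` would be run only on a throwaway test instance, with everything stopped, since it deliberately corrupts the block.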

{noformat}
org.apache.hadoop.fs.ChecksumException: Checksum error: /blk_5930498692645763206:of:/accumulo/wal/127.0.0.1+9997/25ae29dc-cb3f-4980-93ea-e2099a394382 at 3539456
	org.apache.hadoop.fs.ChecksumException: Checksum error: /blk_5930498692645763206:of:/accumulo/wal/127.0.0.1+9997/25ae29dc-cb3f-4980-93ea-e2099a394382 at 3539456
		at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)
		at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
		at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
		at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
		at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
		at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1460)
		at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:2175)
		at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2227)
		at java.io.DataInputStream.readFully(DataInputStream.java:178)
		at java.io.DataInputStream.readFully(DataInputStream.java:152)
		at org.apache.accumulo.core.data.Mutation.readFields(Mutation.java:578)
{noformat}

I was trying to see what would happen if the disk filled up while the checksum data
was being written out.
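
For anyone wondering why trimming the block leads to the ChecksumException above: HDFS keeps a CRC for each fixed-size chunk of a block (512 bytes by default in this Hadoop line, as far as I know), so data that no longer matches its stored checksums fails on read. The sketch below is not HDFS code; `chunk_checksums` and `first_bad_offset` are hypothetical names and the chunk size is an assumption. It only shows the idea of per-chunk verification.

```python
import zlib
from typing import List, Optional

CHUNK = 512  # assumed bytes-per-checksum; not taken from this cluster's config


def chunk_checksums(data: bytes, chunk: int = CHUNK) -> List[int]:
    """Compute one CRC32 per fixed-size chunk of `data`."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def first_bad_offset(data: bytes, sums: List[int], chunk: int = CHUNK) -> Optional[int]:
    """Return the byte offset of the first chunk whose CRC32 does not match
    the stored checksum, or None if every chunk verifies."""
    for idx, expected in enumerate(sums):
        piece = data[idx * chunk:(idx + 1) * chunk]
        if zlib.crc32(piece) != expected:
            return idx * chunk
    return None
```

With checksums computed over the full data, verifying a copy that has lost its last few bytes reports the start of the final chunk, which fits the shape of the error above: the reported offset, 3539456, is a multiple of 512.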
                
> Corrupt WAL file
> ----------------
>
>                 Key: ACCUMULO-716
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-716
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>         Environment: java version "1.6.0_33", hadoop-0.20.2-cdh3u3
>            Reporter: Josh Elser
>            Assignee: Eric Newton
>
> Ran wikisearch-ingest. Ended up filling up a drive used by HDFS and things failed not-so-gracefully.
> Upon restart, log recovery started, appeared to finish (failed HDFS checksum on one WAL entry),
> and left Accumulo in a state where no tablets were assigned.

