Mailing-List: contact notifications-help@accumulo.apache.org; run by ezmlm
Precedence: bulk
Reply-To: jira@apache.org
Date: Wed, 9 Jan 2013 18:56:12 +0000 (UTC)
From: "Eric Newton (JIRA)" <jira@apache.org>
To: notifications@accumulo.apache.org
Message-ID: <JIRA.12601514.1344202747535.108057.1357757772656@arcas>
In-Reply-To: <JIRA.12601514.1344202747535@arcas>
References: <JIRA.12601514.1344202747535@arcas>
Subject: [jira] [Commented] (ACCUMULO-716) Corrupt WAL file
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/ACCUMULO-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548822#comment-13548822 ] 

Eric Newton commented on ACCUMULO-716:
--------------------------------------

I was able to reproduce this.

 * start Accumulo
 * use TestIngest to write some data
 * kill everything
 * find the last block of the last WAL file in the NameNode logs
 * find that block's file and delete some bytes from the end of it
 * start Accumulo again
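
The truncation step above could be sketched roughly as follows. This is only an illustration, assuming the block file has already been located on local disk; the helper name truncate_tail and the byte counts are hypothetical and are not taken from the original test.

```python
import os


def truncate_tail(path, nbytes):
    """Remove the last nbytes bytes of the file at path and return the new size.

    Hypothetical helper: it only models "delete the last bunch of bytes"
    for a file that has already been found on local disk.
    """
    size = os.path.getsize(path)
    # Never go below zero if nbytes is larger than the file itself.
    new_size = max(0, size - nbytes)
    # os.truncate shortens the file in place; earlier bytes are untouched.
    os.truncate(path, new_size)
    return new_size
```

Only the data file is shortened here; any separately stored checksum data is left as it was, which is what makes the mismatch visible on the next read.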

{noformat}
org.apache.hadoop.fs.ChecksumException: Checksum error: /blk_5930498692645763206:of:/accumulo/wal/127.0.0.1+9997/25ae29dc-cb3f-4980-93ea-e2099a394382 at 3539456
	org.apache.hadoop.fs.ChecksumException: Checksum error: /blk_5930498692645763206:of:/accumulo/wal/127.0.0.1+9997/25ae29dc-cb3f-4980-93ea-e2099a394382 at 3539456
		at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)
		at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
		at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
		at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
		at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
		at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1460)
		at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:2175)
		at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2227)
		at java.io.DataInputStream.readFully(DataInputStream.java:178)
		at java.io.DataInputStream.readFully(DataInputStream.java:152)
		at org.apache.accumulo.core.data.Mutation.readFields(Mutation.java:578)
{noformat}

I was trying to see what would happen if the disk filled up while the checksum data was being written out.
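
A rough model of why trimmed bytes lead to the error above: HDFS stores a CRC for each fixed-size chunk of a block, so a shortened final chunk no longer matches its stored checksum. The sketch below is not the HDFS implementation. The 512-byte chunk size is an assumption about this cluster's configuration, and chunk_checksums and first_bad_offset are hypothetical helpers built on zlib.crc32.

```python
import zlib

CHUNK = 512  # assumed bytes per checksum; not confirmed for this cluster


def chunk_checksums(data, chunk=CHUNK):
    """Return one CRC32 per chunk-sized slice of data."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def first_bad_offset(data, sums, chunk=CHUNK):
    """Return the byte offset of the first chunk whose CRC32 differs
    from the stored value in sums, or None if every chunk matches."""
    for idx, expected in enumerate(sums):
        piece = data[idx * chunk:(idx + 1) * chunk]
        if zlib.crc32(piece) != expected:
            # Report the start of the failing chunk, not the exact byte.
            return idx * chunk
    return None
```

In this model the reported offset is always a multiple of the chunk size. The offset in the trace, 3539456, equals 6913 x 512, which would be consistent with a chunk-aligned report under that assumption.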
                
> Corrupt WAL file
> ----------------
>
>                 Key: ACCUMULO-716
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-716
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>         Environment: java version "1.6.0_33", hadoop-0.20.2-cdh3u3
>            Reporter: Josh Elser
>            Assignee: Eric Newton
>
> Ran wikisearch-ingest. Ended up filling up a drive used by HDFS and things failed not-so-gracefully. Upon restart, log recovery started, appeared to finish (failed HDFS checksum on one WAL entry), and left Accumulo in a state where no tablets were assigned.
