hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3770) TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed
Date Tue, 07 Aug 2012 02:57:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429679#comment-13429679 ]

Eli Collins commented on HDFS-3770:
-----------------------------------

Here's the relevant portion of the log:

Exception in thread "Thread-2125" java.lang.RuntimeException: org.apache.hadoop.fs.ChecksumException:
Checksum error: /block-being-written-to at 1072128 exp: 1082174632 got: -132500175
	at org.apache.hadoop.hdfs.TestFileConcurrentReader$4.run(TestFileConcurrentReader.java:383)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error: /block-being-written-to at 1072128 exp: 1082174632 got: -132500175
	at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:297)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.verifyPacketChecksums(RemoteBlockReader2.java:221)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:191)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:130)
	at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:526)
	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:578)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:632)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:673)
	at java.io.DataInputStream.read(DataInputStream.java:83)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader.tailFile(TestFileConcurrentReader.java:440)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader.access$200(TestFileConcurrentReader.java:54)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader$4.run(TestFileConcurrentReader.java:379)
	... 1 more
Exception in thread "Thread-2124" java.lang.RuntimeException: java.io.InterruptedIOException:
Interrupted while waiting for data to be acknowledged by pipeline
	at org.apache.hadoop.hdfs.TestFileConcurrentReader$3.run(TestFileConcurrentReader.java:367)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.InterruptedIOException: Interrupted while waiting for data to be acknowledged by pipeline
	at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:1649)
	at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:1633)
	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1718)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:71)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:99)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader$3.run(TestFileConcurrentReader.java:363)

And here's the surrounding log context:

2012-08-06 23:38:14,373 INFO  hdfs.StateChange (FSNamesystem.java:reportBadBlocks(4727)) - *DIR* NameNode.reportBadBlocks
2012-08-06 23:38:14,374 INFO  hdfs.StateChange (CorruptReplicasMap.java:addToCorruptReplicasMap(66)) - BLOCK NameSystem.addToCorruptReplicasMap: blk_4844811661965065785 added as corrupt on 127.0.0.1:33823 by /127.0.0.1 because client machine reported it
2012-08-06 23:38:14,375 ERROR hdfs.TestFileConcurrentReader (TestFileConcurrentReader.java:run(381)) - error tailing file /block-being-written-to
org.apache.hadoop.fs.ChecksumException: Checksum error: /block-being-written-to at 1072128 exp: 1082174632 got: -132500175
	at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:297)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.verifyPacketChecksums(RemoteBlockReader2.java:221)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:191)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:130)
	at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:526)
	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:578)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:632)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:673)
	at java.io.DataInputStream.read(DataInputStream.java:83)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader.tailFile(TestFileConcurrentReader.java:440)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader.access$200(TestFileConcurrentReader.java:54)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader$4.run(TestFileConcurrentReader.java:379)
	at java.lang.Thread.run(Thread.java:662)
2012-08-06 23:38:14,376 ERROR hdfs.TestFileConcurrentReader (TestFileConcurrentReader.java:run(393)) - error in tailer
java.lang.RuntimeException: org.apache.hadoop.fs.ChecksumException: Checksum error: /block-being-written-to at 1072128 exp: 1082174632 got: -132500175
	at org.apache.hadoop.hdfs.TestFileConcurrentReader$4.run(TestFileConcurrentReader.java:383)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error: /block-being-written-to at 1072128 exp: 1082174632 got: -132500175
	at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:297)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.verifyPacketChecksums(RemoteBlockReader2.java:221)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:191)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:130)
	at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:526)
	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:578)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:632)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:673)
	at java.io.DataInputStream.read(DataInputStream.java:83)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader.tailFile(TestFileConcurrentReader.java:440)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader.access$200(TestFileConcurrentReader.java:54)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader$4.run(TestFileConcurrentReader.java:379)
	... 1 more
2012-08-06 23:38:14,377 INFO  FSNamesystem.audit (FSNamesystem.java:logAuditEvent(267)) - allowed=true	ugi=jenkins (auth:SIMPLE)	ip=/127.0.0.1	cmd=append	src=/block-being-written-to	dst=null	perm=null
2012-08-06 23:38:14,377 DEBUG hdfs.DFSClient (DFSOutputStream.java:computePacketChunkSize(1329)) - computePacketChunkSize: src=/block-being-written-to, chunkSize=442, chunksPerPacket=1, packetSize=475
2012-08-06 23:38:14,378 DEBUG hdfs.DFSClient (DFSOutputStream.java:queueCurrentPacket(1342)) - Queued packet 0
2012-08-06 23:38:14,378 DEBUG hdfs.DFSClient (DFSOutputStream.java:waitForAckedSeqno(1638)) - Waiting for ack for: 0
2012-08-06 23:38:14,378 DEBUG hdfs.DFSClient (DFSOutputStream.java:run(466)) - Append to block BP-402451742-67.195.138.20-1344296267847:blk_4844811661965065785_2099
2012-08-06 23:38:14,378 ERROR hdfs.TestFileConcurrentReader (TestFileConcurrentReader.java:run(365)) - error in writer
java.io.InterruptedIOException: Interrupted while waiting for data to be acknowledged by pipeline
	at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:1649)
	at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:1633)
	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1718)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:71)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:99)
	at org.apache.hadoop.hdfs.TestFileConcurrentReader$3.run(TestFileConcurrentReader.java:363)
	at java.lang.Thread.run(Thread.java:662)
2012-08-06 23:38:14,379 INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1317)) - Shutting down the Mini HDFS Cluster
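
The writer stack above matches the hypothesis in the description below: close() goes through flushInternal() into waitForAckedSeqno(), which is parked on dataQueue when the interrupt arrives. For illustration, here's a minimal sketch of that wait/notify pattern (hypothetical names, heavily simplified, not the actual DFSOutputStream code):

import java.io.InterruptedIOException;

// Sketch of the suspect pattern: the closing thread waits on a shared monitor
// for the pipeline ack, and the ack-receiver thread is supposed to wake it.
class AckWaiter {
  private final Object dataQueue = new Object();
  private long lastAckedSeqno = -1; // updated by the ack-receiver thread

  // Analogue of waitForAckedSeqno(): block until seqno has been acknowledged.
  void waitForAckedSeqno(long seqno) throws InterruptedIOException {
    synchronized (dataQueue) {
      while (lastAckedSeqno < seqno) {
        try {
          dataQueue.wait(); // woken by ackReceived() below
        } catch (InterruptedException ie) {
          // If another thread calls interrupt() without first notifying
          // dataQueue (or waiting for the notification to be delivered),
          // we land here even though the ack may arrive moments later.
          throw new InterruptedIOException(
              "Interrupted while waiting for data to be acknowledged by pipeline");
        }
      }
    }
  }

  // Analogue of the ack-receiver path: record the ack and wake the waiter.
  void ackReceived(long seqno) {
    synchronized (dataQueue) {
      lastAckedSeqno = seqno;
      dataQueue.notifyAll();
    }
  }
}

If the interrupt lands before the ack thread's notifyAll() is delivered, the close fails with InterruptedIOException, and the reader can then trip over the partially written block, which would make the ChecksumException a symptom rather than the root cause.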

                
> TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-3770
>                 URL: https://issues.apache.org/jira/browse/HDFS-3770
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.0.0
>            Reporter: Eli Collins
>
> TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed on [a recent
> job|https://builds.apache.org/job/PreCommit-HDFS-Build/2959]. Looks like a race in the test.
> The failure surfaces as a ChecksumException, but that's likely a side effect of the DFSOutputStream
> getting interrupted on close. Looking at the relevant code, waitForAckedSeqno gets an
> InterruptedException while waiting on dataQueue; it looks like there are uses of interrupt()
> where we're not first notifying dataQueue, or not waiting for the notifications to be delivered.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
