hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3874) Exception when client reports bad checksum to NN
Date Thu, 30 Aug 2012 22:15:07 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445350#comment-13445350 ]

Todd Lipcon commented on HDFS-3874:
-----------------------------------

The bug seems to be that the datanode doesn't report the correct remote DN when it detects a
checksum error while receiving a block: the last log entry below names the source as
"datanode :0". Here are the DN-side logs:

{code}
2012-08-27 16:34:30,396 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Checksum error in block BP-1507505631-172.29.97.196-1337120439433:blk_8285012733733669474_140475196 from /172.29.97.219:52544
org.apache.hadoop.fs.ChecksumException: Checksum error: DFSClient_NONMAPREDUCE_334070927_1 at 44032 exp: -983390667 got: 557443094
        at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:335)
        at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:266)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.verifyChunks(BlockReceiver.java:377)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:496)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:506)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:219)
        at java.lang.Thread.run(Thread.java:662)
2012-08-27 16:34:30,396 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: report corrupt block BP-1507505631-172.29.97.196-1337120439433:blk_8285012733733669474_140475196 from datanode :0 to namenode
{code}
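
To check whether other blocks on a cluster hit the same problem, the DN logs can be scanned for corrupt-block reports whose source datanode is unresolved. The following is only a minimal sketch: the function name `find_unresolved_sources` is hypothetical, and the pattern is modeled solely on the INFO line quoted above, so the wording may differ in other Hadoop versions.

```python
import re

# Pattern modeled on the DataNode INFO line quoted above; the exact wording
# in other Hadoop versions is an assumption.
REPORT_RE = re.compile(
    r"report corrupt block (?P<block>\S+) from datanode (?P<source>\S*) to namenode"
)


def find_unresolved_sources(log_text):
    """Return the blocks whose reported source datanode has no host or port 0."""
    # Collapse whitespace so mail-wrapped log lines still match the pattern.
    flat = " ".join(log_text.split())
    suspicious = []
    for match in REPORT_RE.finditer(flat):
        # Split "host:port" on the last colon; ":0" yields an empty host.
        host, _, port = match.group("source").rpartition(":")
        if not host or port == "0":
            suspicious.append(match.group("block"))
    return suspicious
```

Run against the log excerpt above, this would flag the block reported "from datanode :0", while a report naming a source such as 172.29.97.219:50010 would be left alone.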
                
> Exception when client reports bad checksum to NN
> ------------------------------------------------
>
>                 Key: HDFS-3874
>                 URL: https://issues.apache.org/jira/browse/HDFS-3874
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client, name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>
> We see the following exception in our logs on a cluster:
> {code}
> 2012-08-27 16:34:30,400 INFO org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.reportBadBlocks
> 2012-08-27 16:34:30,400 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: Cannot mark blk_8285012733733669474_140475196{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.29.97.219:50010|RBW]]}(same as stored) as corrupt because datanode :0 does not exist
> 2012-08-27 16:34:30,400 INFO org.apache.hadoop.ipc.Server: IPC Server handler 46 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.reportBadBlocks from 172.29.97.219:43805: error: java.io.IOException: Cannot mark blk_8285012733733669474_140475196{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.29.97.219:50010|RBW]]}(same as stored) as corrupt because datanode :0 does not exist
> java.io.IOException: Cannot mark blk_8285012733733669474_140475196{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.29.97.219:50010|RBW]]}(same as stored) as corrupt because datanode :0 does not exist
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.markBlockAsCorrupt(BlockManager.java:1001)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.findAndMarkBlockAsCorrupt(BlockManager.java:994)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.reportBadBlocks(FSNamesystem.java:4736)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.reportBadBlocks(NameNodeRpcServer.java:537)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.reportBadBlocks(DatanodeProtocolServerSideTranslatorPB.java:242)
>         at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:20032)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
> {code}

