hadoop-hdfs-issues mailing list archives

From "Erik Krogen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11472) Fix inconsistent replica size after a data pipeline failure
Date Fri, 02 Jun 2017 20:48:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035397#comment-16035397 ]

Erik Krogen commented on HDFS-11472:
------------------------------------

[~jojochuang] no problem! So I was actually wondering whether, following the same reasoning as
{{recoverRbwImpl}}, it may be better for {{initReplicaRecoveryImpl}} to fall back to {{getBlockDataLength()}}
when {{bytesOnDisk}} is unexpectedly small, something like this:

{code}
      // check replica bytes on disk.
      long bytesOnDisk = replica.getBytesOnDisk();
      if (bytesOnDisk < replica.getVisibleLength()) {
        long dataLength = replica.getBlockDataLength();
        if (bytesOnDisk != dataLength) {
          LOG.warn("replica recovery: replica.getBytesOnDisk() = " +
              replica.getBytesOnDisk() + " != " +
              "replica.getBlockDataLength() = " + dataLength +
              ", replica = " + replica);
          // Trust the length of the block file on disk; passing null drops the
          // cached last-chunk checksum.
          rip.setLastChecksumAndDataLen(dataLength, null);
        }
        // Re-check the invariant now that the in-memory length has been refreshed.
        if (replica.getBytesOnDisk() < replica.getVisibleLength()) {
          throw new IOException("THIS IS NOT SUPPOSED TO HAPPEN:"
              + " getBytesOnDisk() < getVisibleLength(), rip=" + replica);
        }
      }
{code}
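
Pulling the same logic out into a self-contained toy, with the numbers from the stack trace in the description plugged in (the class and field names below are simplified stand-ins for the real {{ReplicaInPipeline}} API, and the block file length is an assumption on my part, not something stated in the report):
{code}
import java.io.IOException;

/** Toy model of the reconciliation proposed above; not the actual HDFS types. */
class ReplicaLengthReconciliation {
  long visibleLength;    // bytesAcked: length acknowledged to the client
  long bytesOnDisk;      // in-memory view of the on-disk length
  long blockDataLength;  // actual length of the block file on disk

  void reconcile() throws IOException {
    if (bytesOnDisk < visibleLength) {
      if (bytesOnDisk != blockDataLength) {
        // The in-memory length is stale; trust the block file. This mirrors
        // rip.setLastChecksumAndDataLen(dataLength, null), which also drops
        // the cached last-chunk checksum.
        bytesOnDisk = blockDataLength;
      }
      if (bytesOnDisk < visibleLength) {
        // Still short after refreshing from disk: data is genuinely missing.
        throw new IOException("getBytesOnDisk() < getVisibleLength()");
      }
    }
  }

  public static void main(String[] args) throws IOException {
    ReplicaLengthReconciliation r = new ReplicaLengthReconciliation();
    r.visibleLength = 27268;    // bytesAcked from the report
    r.bytesOnDisk = 27006;      // stale in-memory counter from the report
    r.blockDataLength = 27530;  // assumed block file length (not stated in the report)
    r.reconcile();              // succeeds instead of throwing
    System.out.println("bytesOnDisk after reconciliation: " + r.bytesOnDisk);
  }
}
{code}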
Do you think this makes sense?

> Fix inconsistent replica size after a data pipeline failure
> -----------------------------------------------------------
>
>                 Key: HDFS-11472
>                 URL: https://issues.apache.org/jira/browse/HDFS-11472
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Critical
>              Labels: release-blocker
>         Attachments: HDFS-11472.001.patch, HDFS-11472.002.patch, HDFS-11472.003.patch, HDFS-11472.testcase.patch
>
>
> We observed a case where a replica's on-disk length is less than its acknowledged length, breaking an assumption in the recovery code.
> {noformat}
> 2017-01-08 01:41:03,532 WARN org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to obtain replica info for block (=BP-947993742-10.204.0.136-1362248978912:blk_2526438952_1101394519586) from datanode (=DatanodeInfoWithStorage[10.204.138.17:1004,null,null])
> java.io.IOException: THIS IS NOT SUPPOSED TO HAPPEN: getBytesOnDisk() < getVisibleLength(), rip=ReplicaBeingWritten, blk_2526438952_1101394519586, RBW
>   getNumBytes()     = 27530
>   getBytesOnDisk()  = 27006
>   getVisibleLength()= 27268
>   getVolume()       = /data/6/hdfs/datanode/current
>   getBlockFile()    = /data/6/hdfs/datanode/current/BP-947993742-10.204.0.136-1362248978912/current/rbw/blk_2526438952
>   bytesAcked=27268
>   bytesOnDisk=27006
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:2284)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:2260)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.initReplicaRecovery(DataNode.java:2566)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.callInitReplicaRecovery(DataNode.java:2577)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2645)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.access$400(DataNode.java:245)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode$5.run(DataNode.java:2551)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> It turns out that if an exception is thrown within {{BlockReceiver#receivePacket}}, the replica's in-memory on-disk length may not be updated, even though the data has already been written to disk.
> For example, here's one exception we observed:
> {noformat}
> 2017-01-08 01:40:59,512 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-947993742-10.204.0.136-1362248978912:blk_2526438952_1101394499067
> java.nio.channels.ClosedByInterruptException
>         at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>         at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.adjustCrcChannelPosition(FsDatasetImpl.java:1484)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.adjustCrcFilePosition(BlockReceiver.java:994)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:670)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:857)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:797)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> There are potentially other places and causes where an exception is thrown within {{BlockReceiver#receivePacket}}, so it may not make much sense to work around this particular exception. Instead, we should improve the replica recovery code to handle the case where the on-disk size is less than the acknowledged size, and update the in-memory checksum accordingly.
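
To make the failure mode above concrete, here is a rough sketch of the write-path ordering that produces the mismatch. The class and method names are simplified stand-ins, not the real {{BlockReceiver}} internals:
{code}
import java.io.IOException;
import java.nio.channels.ClosedByInterruptException;

/**
 * Rough sketch of the ordering that leaves bytesOnDisk behind the acknowledged
 * length; the methods below are stand-ins, not the real BlockReceiver code.
 */
class PacketWriteOrderSketch {
  long bytesOnDisk;  // in-memory counter that replica recovery later consults

  void receiveOnePacket(byte[] data, long newOffsetInBlock) throws IOException {
    writeDataToBlockFile(data);      // 1. bytes land in the block file on disk
    updateChecksumFile();            // 2. may throw, e.g. when the writer thread
                                     //    is interrupted during a pipeline failure
    bytesOnDisk = newOffsetInBlock;  // 3. never reached, so bytesOnDisk stays
                                     //    smaller than the acknowledged length
  }

  void writeDataToBlockFile(byte[] data) throws IOException {
    // stand-in for writing the packet payload to the block file
  }

  void updateChecksumFile() throws IOException {
    // simulate the interrupt observed in the log above
    throw new ClosedByInterruptException();
  }
}
{code}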



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


