hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-13243) Get CorruptBlock because of calling close and sync in same time
Date Thu, 08 Mar 2018 12:40:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391178#comment-16391178 ]

Wei-Chiu Chuang edited comment on HDFS-13243 at 3/8/18 12:39 PM:
-----------------------------------------------------------------

Hi [~gzh1992n], thanks very much for reporting the issue. The NN log is useful too.

I looked at the patch and the log and tried to understand where the problem is. I don't think I understand it fully yet, but here are some thoughts I'd like to share with you.
 # It seems to me the root of the problem is that the client calls fsync() with an incorrect length (shorter than what it is supposed to sync). If that's the case, the fix belongs in the client (DFSOutputStream) rather than in the NameNode.
 # Looking at the log, your minimum replication number is 2 rather than 1. That's very unusual; in my experience, a lot of weird behavior like this can arise from that kind of configuration.
 # Also, why is close() in the picture at all? IMHO you don't even need close(). Suppose you block the DataNode heartbeats, keep the file open on the client, and then call sync(); the last block's state would remain COMMITTED. Would that reproduce the same behavior? (There's a rough test sketch after the code excerpt below.)
 # Looking at the patch, I would ask you to stay away from reflection. You could refactor FSNamesystem and DFSOutputStream to return a new FSNamesystem/DFSOutputStream object and override that in the test code. That way you don't need to introduce new configuration options either, and it'll be much cleaner. (See the pattern sketch after the code excerpt below.)
 # I don't understand the following code:
 ## The case where lastBlockLength <= 0 || lastBlockLength <= b.getNumBytes() holds is unexpected. You should not just log a debug message and ignore it; it has to be a WARN-level message, and it should log the value of b.getNumBytes() as well. There is also a grammatical error in the log message. (See the logging sketch after the code excerpt.)
 ## If your fix is correct, you should also update the assertion in FileUnderConstructionFeature#updateLengthOfLastBlock() so it expects neither COMMITTED nor COMPLETE. (A sketch follows the code excerpt.)
 ## What should it do when the block state is unexpected? I don't think you should just ignore it.

{code:java}
BlockInfo b = pendingFile.getLastBlock();
if (lastBlockLength <= 0 || lastBlockLength <= b.getNumBytes()) {
  LOG.debug("lastBlockLength(" + lastBlockLength + ") seems wrong, maybe have a bug here?");
  return;
}

if (b.getBlockUCState() != BlockUCState.COMMITTED &&
    b.getBlockUCState() != BlockUCState.COMPLETE) {
{code}
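
To illustrate item 3, here is a rough MiniDFSCluster outline of the scenario I have in mind. This is only a sketch, not a working test; helper names such as DataNodeTestUtils.setHeartbeatsDisabledForTests are quoted from the test utilities as I remember them, so please double-check them against trunk.
{code:java}
// Rough outline only (item 3): keep the file open, stop heartbeats, then sync.
Configuration conf = new HdfsConfiguration();
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
try {
  DistributedFileSystem fs = cluster.getFileSystem();
  FSDataOutputStream out = fs.create(new Path("/test-committed-block"));
  out.write(new byte[128 * 1024]);

  // Block DataNode heartbeats so the last block cannot be completed.
  for (DataNode dn : cluster.getDataNodes()) {
    DataNodeTestUtils.setHeartbeatsDisabledForTests(dn, true);
  }

  // Do not close(); sync with a length update while the file stays open.
  // The question: does the last block stay COMMITTED and trigger the same
  // corrupt-replica report once the DataNodes report the block?
  ((DFSOutputStream) out.getWrappedStream())
      .hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
} finally {
  cluster.shutdown();
}
{code}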
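
For item 4, this is the kind of refactoring I mean, sketched generically. All of the class and method names below are made up for illustration; they are not existing HDFS APIs.
{code:java}
// Generic sketch of the test-override pattern (item 4): replace reflection
// with an overridable factory method. Every name here is hypothetical.
class Client {
  // Production code goes through the factory...
  protected OutputStreamLike createOutputStream() {
    return new OutputStreamLike();
  }
}

class TestClient extends Client {
  @Override
  protected OutputStreamLike createOutputStream() {
    // ...so the test can return an instrumented subclass: no reflection,
    // and no new configuration keys.
    return new FaultInjectingOutputStream();
  }
}

class OutputStreamLike { }
class FaultInjectingOutputStream extends OutputStreamLike { }
{code}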
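
On item 5.1, a minimal sketch of the logging I'd expect, reusing the names from your snippet (the exact message wording is up to you):
{code:java}
// Sketch (item 5.1): surface the inconsistency at WARN and include both lengths.
BlockInfo b = pendingFile.getLastBlock();
if (lastBlockLength <= 0 || lastBlockLength <= b.getNumBytes()) {
  LOG.warn("Unexpected lastBlockLength " + lastBlockLength + " for " + b
      + "; length recorded in the block map is " + b.getNumBytes());
  // Returning silently hides a real inconsistency; decide explicitly what
  // the correct behavior is here rather than ignoring it.
  return;
}
{code}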
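
And on item 5.2, something along these lines is what I mean for FileUnderConstructionFeature#updateLengthOfLastBlock() (paraphrased, not the exact existing assertion):
{code:java}
// Sketch (item 5.2): if the fix is correct, the precondition should reject a
// last block in either COMMITTED or COMPLETE state.
assert lastBlock.getBlockUCState() != BlockUCState.COMMITTED
    && lastBlock.getBlockUCState() != BlockUCState.COMPLETE
    : "The last block is expected to be under construction, but is "
      + lastBlock.getBlockUCState();
{code}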
 

 


> Get CorruptBlock because of calling close and sync in same time
> ---------------------------------------------------------------
>
>                 Key: HDFS-13243
>                 URL: https://issues.apache.org/jira/browse/HDFS-13243
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.2, 3.2.0
>            Reporter: Zephyr Guo
>            Priority: Critical
>             Fix For: 3.2.0
>
>         Attachments: HDFS-13243-v1.patch, HDFS-13243-v2.patch
>
>
> An HDFS file might get broken because of corrupt block(s) produced by calling close and sync at the same time.
> When the close call does not succeed, the UC block's state changes to COMMITTED; if a sync request is then popped from the queue and processed, the sync operation changes the last block's length.
> After that, the DataNodes report all received blocks to the NameNode, which checks the block length of all COMMITTED blocks. The length recorded in NameNode memory already differs from the length reported by the DataNodes, so the last block is marked as corrupt because of the inconsistent length.
>  
> {panel:title=Log in my hdfs}
> 2018-03-05 04:05:39,261 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_1085498930_11758129{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-32c7e479-3845-4a44-adf1-831edec7506b:NORMAL:10.0.0.219:50010|RBW], ReplicaUC[[DISK]DS-a9a5d653-c049-463d-8e4a-d1f0dc14409c:NORMAL:10.0.0.220:50010|RBW], ReplicaUC[[DISK]DS-f2b7c04a-b724-4c69-abbf-d2e416f70706:NORMAL:10.0.0.218:50010|RBW]]} for /hbase/WALs/hb-j5e517al6xib80rkb-006.hbase.rds.aliyuncs.com,16020,1519845790686/hb-j5e517al6xib80rkb-006.hbase.rds.aliyuncs.com%2C16020%2C1519845790686.default.1520193926515
> 2018-03-05 04:05:39,760 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* fsync: /hbase/WALs/hb-j5e517al6xib80rkb-006.hbase.rds.aliyuncs.com,16020,1519845790686/hb-j5e517al6xib80rkb-006.hbase.rds.aliyuncs.com%2C16020%2C1519845790686.default.1520193926515 for DFSClient_NONMAPREDUCE_1077513762_1
> 2018-03-05 04:05:39,761 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* blk_1085498930_11758129{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-32c7e479-3845-4a44-adf1-831edec7506b:NORMAL:10.0.0.219:50010|RBW], ReplicaUC[[DISK]DS-a9a5d653-c049-463d-8e4a-d1f0dc14409c:NORMAL:10.0.0.220:50010|RBW], ReplicaUC[[DISK]DS-f2b7c04a-b724-4c69-abbf-d2e416f70706:NORMAL:10.0.0.218:50010|RBW]]} is not COMPLETE (ucState = COMMITTED, replication# = 0 < minimum = 2) in file /hbase/WALs/hb-j5e517al6xib80rkb-006.hbase.rds.aliyuncs.com,16020,1519845790686/hb-j5e517al6xib80rkb-006.hbase.rds.aliyuncs.com%2C16020%2C1519845790686.default.1520193926515
> 2018-03-05 04:05:39,761 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.0.0.220:50010 is added to blk_1085498930_11758129{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-32c7e479-3845-4a44-adf1-831edec7506b:NORMAL:10.0.0.219:50010|RBW], ReplicaUC[[DISK]DS-a9a5d653-c049-463d-8e4a-d1f0dc14409c:NORMAL:10.0.0.220:50010|RBW], ReplicaUC[[DISK]DS-f2b7c04a-b724-4c69-abbf-d2e416f70706:NORMAL:10.0.0.218:50010|RBW]]} size 2054413
> 2018-03-05 04:05:39,761 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1085498930 added as corrupt on 10.0.0.219:50010 by hb-j5e517al6xib80rkb-006.hbase.rds.aliyuncs.com/10.0.0.219 because block is COMMITTED and reported length 2054413 does not match length in block map 141232
> 2018-03-05 04:05:39,762 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1085498930 added as corrupt on 10.0.0.218:50010 by hb-j5e517al6xib80rkb-004.hbase.rds.aliyuncs.com/10.0.0.218 because block is COMMITTED and reported length 2054413 does not match length in block map 141232
> 2018-03-05 04:05:40,162 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* blk_1085498930_11758129{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-32c7e479-3845-4a44-adf1-831edec7506b:NORMAL:10.0.0.219:50010|RBW], ReplicaUC[[DISK]DS-a9a5d653-c049-463d-8e4a-d1f0dc14409c:NORMAL:10.0.0.220:50010|RBW], ReplicaUC[[DISK]DS-f2b7c04a-b724-4c69-abbf-d2e416f70706:NORMAL:10.0.0.218:50010|RBW]]} is not COMPLETE (ucState = COMMITTED, replication# = 3 >= minimum = 2) in file /hbase/WALs/hb-j5e517al6xib80rkb-006.hbase.rds.aliyuncs.com,16020,1519845790686/hb-j5e517al6xib80rkb-006.hbase.rds.aliyuncs.com%2C16020%2C1519845790686.default.1520193926515
> {panel}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
