hadoop-hdfs-issues mailing list archives

From "Wellington Chevreuil (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11821) BlockManager.getMissingReplOneBlocksCount() does not report correct value if corrupt file with replication factor of 1 gets deleted
Date Fri, 08 Sep 2017 15:05:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158762#comment-16158762
] 

Wellington Chevreuil commented on HDFS-11821:
---------------------------------------------

Thanks for the review and relevant comments [~raviprak]! Regarding the tests, they pass locally for me. Also, apart from *hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080*,
these failures are entirely different from the ones in the previous build. I also don't think the patch changes would
influence pipeline/append/read behaviour, as they only concern the metrics update. Pasting test output
snippets from my local build:

{noformat}
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 85.39 sec - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140

Results :

Tests run: 14, Failures: 0, Errors: 0, Skipped: 0
...
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hdfs.TestFileAppendRestart
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.935 sec - in org.apache.hadoop.hdfs.TestFileAppendRestart

Results :

Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
...
Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 121.347 sec - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080

Results :

Tests run: 14, Failures: 0, Errors: 0, Skipped: 0
...
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 75.274 sec - in org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding

Results :

Tests run: 5, Failures: 0, Errors: 0, Skipped: 0
...
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hdfs.TestLeaseRecoveryStriped
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.307 sec - in org.apache.hadoop.hdfs.TestLeaseRecoveryStriped

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
...
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 62.726 sec - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120

Results :

Tests run: 14, Failures: 0, Errors: 0, Skipped: 0
...
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hdfs.TestPipelines
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.92 sec - in org.apache.hadoop.hdfs.TestPipelines

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
...
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.293 sec - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070

Results :

Tests run: 14, Failures: 0, Errors: 0, Skipped: 0
...
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hdfs.TestWriteReadStripedFile
Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 73.311 sec - in org.apache.hadoop.hdfs.TestWriteReadStripedFile

Results :

Tests run: 17, Failures: 0, Errors: 0, Skipped: 0
{noformat} 

bq. My concern with your patch is that remove will now be a bit slower. I think I remember
there used to be a time when deletes were holding up the lock for a long time. Kihwal Lee
Do you have an objection?
I guess the major concern here is that the *countNodes* method iterates over the block's *StorageInfo*
objects, checking the replica state on each storage to determine the replication health. I
suppose this is bounded by the block's replication factor, so it wouldn't be a large loop. Is
this a correct assumption, or would there still be other overheads that could impact delete
performance?
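
To illustrate what I mean, here is a minimal, self-contained sketch of that kind of replica scan (hypothetical names, not the real *BlockManager*/*DatanodeStorageInfo* code): the loop only visits the storages that reported a replica of the block, so its cost grows with the replication factor rather than with cluster size.

{noformat}
import java.util.List;

// Simplified sketch of a countNodes-style replica scan (illustrative only,
// not the actual BlockManager implementation).
class ReplicaScanSketch {
  enum ReplicaState { LIVE, CORRUPT }

  // Each list entry models one storage that reported a replica of the block,
  // so the loop runs at most "replication factor" times.
  static int[] countReplicaStates(List<ReplicaState> replicasOnStorages) {
    int live = 0, corrupt = 0;
    for (ReplicaState state : replicasOnStorages) {
      if (state == ReplicaState.CORRUPT) corrupt++;
      else live++;
    }
    return new int[] { live, corrupt };
  }

  public static void main(String[] args) {
    // A replication-factor-1 file whose only replica is corrupt:
    int[] counts = countReplicaStates(List.of(ReplicaState.CORRUPT));
    System.out.println("live=" + counts[0] + ", corrupt=" + counts[1]); // live=0, corrupt=1
  }
}
{noformat}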

bq. I'm also wondering what happens when the information returned by countNodes is inaccurate
(i.e. HDFS hasn't yet realized that the block is corrupt)
In that case, I believe the block would not be in any of the priority queues of *LowRedundancyBlocks*.
A call to *LowRedundancyBlocks.remove* would then not find the block, so no counters would be
updated, which I think is the correct behaviour. What do you think?
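
A simplified, hypothetical sketch of that queue bookkeeping (not the actual *LowRedundancyBlocks* code, names are assumptions): if the block was never inserted into a queue, the remove attempt finds nothing and the counter stays untouched.

{noformat}
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; not the real LowRedundancyBlocks implementation.
class LowRedundancyQueueSketch {
  private final Set<Long> corruptReplOneBlocks = new HashSet<>(); // queued block ids
  private long missingReplOneCount = 0;

  void markCorruptReplOne(long blockId) {
    if (corruptReplOneBlocks.add(blockId)) {
      missingReplOneCount++;   // counter only changes when the block is actually queued
    }
  }

  void remove(long blockId) {
    if (corruptReplOneBlocks.remove(blockId)) {
      missingReplOneCount--;   // counter only changes when the block was actually queued
    }
    // If HDFS had not yet noticed the corruption, the block was never queued:
    // remove() finds nothing and the counter is (correctly) left untouched.
  }

  long getMissingReplOneCount() {
    return missingReplOneCount;
  }

  public static void main(String[] args) {
    LowRedundancyQueueSketch queues = new LowRedundancyQueueSketch();
    queues.remove(1073742742L);                        // block never queued: no-op
    System.out.println(queues.getMissingReplOneCount()); // 0
  }
}
{noformat}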

> BlockManager.getMissingReplOneBlocksCount() does not report correct value if corrupt
file with replication factor of 1 gets deleted
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11821
>                 URL: https://issues.apache.org/jira/browse/HDFS-11821
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.6.0, 3.0.0-alpha2
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Minor
>         Attachments: HDFS-11821-1.patch, HDFS-11821-2.patch
>
>
> *BlockManager* keeps a separate metric for the number of missing blocks with a replication
factor of 1. This is currently returned by the *BlockManager.getMissingReplOneBlocksCount()*
method, and it is what is displayed in the attribute below for *dfsadmin -report* (in the example
below, there is one corrupt block belonging to a file with replication factor 1):
> {noformat}
> ...
> Missing blocks (with replication factor 1): 1
> ...
> {noformat}
> However, if the related file gets deleted (for instance, using the hdfs fsck -delete option),
this metric never gets updated, and *dfsadmin -report* will keep reporting a missing block
even though the file does not exist anymore. The only workaround available is to restart the
NN so that this metric is cleared.
> This can be easily reproduced by forcing corruption of a replication factor 1 file as
follows:
> 1) Put a file into hdfs with replication factor 1:
> {noformat}
> $ hdfs dfs -Ddfs.replication=1 -put test_corrupt /
> $ hdfs dfs -ls /
> -rw-r--r--   1 hdfs     supergroup         19 2017-05-10 09:21 /test_corrupt
> {noformat}
> 2) Find related block for the file and delete it from DN:
> {noformat}
> $ hdfs fsck /test_corrupt -files -blocks -locations
> ...
> /test_corrupt 19 bytes, 1 block(s):  OK
> 0. BP-782213640-172.31.113.82-1494420317936:blk_1073742742_1918 len=19 Live_repl=1 [DatanodeInfoWithStorage[172.31.112.178:20002,DS-a0dc0b30-a323-4087-8c36-26ffdfe44f46,DISK]]
> Status: HEALTHY
> ...
> $ find /dfs/dn/ -name blk_1073742742*
> /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742
> /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742_1918.meta
> $ rm -rf /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742
> $ rm -rf /dfs/dn/current/BP-782213640-172.31.113.82-1494420317936/current/finalized/subdir0/subdir3/blk_1073742742_1918.meta
> {noformat}
> 3) Running fsck will report the corruption as expected:
> {noformat}
> $ hdfs fsck /test_corrupt -files -blocks -locations
> ...
> /test_corrupt 19 bytes, 1 block(s): 
> /test_corrupt: CORRUPT blockpool BP-782213640-172.31.113.82-1494420317936 block blk_1073742742
>  MISSING 1 blocks of total size 19 B
> ...
> Total blocks (validated):	1 (avg. block size 19 B)
>   ********************************
>   UNDER MIN REPL'D BLOCKS:	1 (100.0 %)
>   dfs.namenode.replication.min:	1
>   CORRUPT FILES:	1
>   MISSING BLOCKS:	1
>   MISSING SIZE:		19 B
>   CORRUPT BLOCKS: 	1
> ...
> {noformat}
> 4) Same for *dfsadmin -report*
> {noformat}
> $ hdfs dfsadmin -report
> ...
> Under replicated blocks: 1
> Blocks with corrupt replicas: 0
> Missing blocks: 1
> Missing blocks (with replication factor 1): 1
> ...
> {noformat}
> 5) Running fsck with the *-delete* option does cause fsck to report correct information about
the corrupt block, but dfsadmin still shows it as missing:
> {noformat}
> $ hdfs fsck /test_corrupt -delete
> ...
> $ hdfs fsck /
> ...
> The filesystem under path '/' is HEALTHY
> ...
> $ hdfs dfsadmin -report
> ...
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 1
> ...
> {noformat}
> The problem seems to be in the *BlockManager.removeBlock()* method, which in turn uses the util
class *LowRedundancyBlocks*, which classifies blocks according to their current replication level,
including blocks currently marked as corrupt.
> The related metric shown by *dfsadmin -report* for corrupt blocks with replication factor
1 is tracked in *LowRedundancyBlocks*. Whenever a block with replication factor 1 is marked
as corrupt, the related metric is updated. When removing the block, though, *BlockManager.removeBlock()*
calls *LowRedundancyBlocks.remove(BlockInfo block, int priLevel)*, which does not check
whether the given block was previously marked as corrupt with replication factor 1, which would
require updating the metric.
> I am shortly proposing a patch that seems to fix this by making *BlockManager.removeBlock()*
call *LowRedundancyBlocks.remove(BlockInfo block, int oldReplicas, int oldReadOnlyReplicas,
int outOfServiceReplicas, int oldExpectedReplicas)* instead, which does update the metric
properly. A rough sketch of the shape of that change is given below.
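> A simplified, hypothetical sketch (illustrative only, not the actual patch; the helper and accessor names used here are assumptions): count the block's replicas before it is dropped and hand those counts to the fuller remove overload, so it can fix up the replication-factor-1 counter.
> {noformat}
> // Hypothetical sketch of the intended change inside BlockManager.removeBlock()
> // (illustrative only; method and field names are assumptions, not the real patch).
> void removeBlock(BlockInfo block) {
>   // Count the replicas *before* the block is dropped, while their state is still known.
>   NumberReplicas replicas = countNodes(block);
>   // The fuller overload can see that a replication-factor-1 block with no live
>   // replicas was being tracked, and decrement the corresponding metric.
>   lowRedundancyBlocks.remove(block,
>       replicas.liveReplicas(),
>       replicas.readOnlyReplicas(),
>       replicas.outOfServiceReplicas(),
>       getExpectedRedundancyNum(block));
>   // ... rest of the removal logic unchanged ...
> }
> {noformat}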



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

