hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7930) commitBlockSynchronization() does not remove locations
Date Wed, 18 Mar 2015 22:29:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368053#comment-14368053 ]

Konstantin Shvachko commented on HDFS-7930:
-------------------------------------------

This will not completely fix {{testTruncateWithDataNodesRestart()}}, though. The location
is correctly invalidated on the NN, but the NN then postpones invalidation on the DN and waits
for the next block report.
{code}
2015-03-18 15:11:02,922 INFO  BlockStateChange (CorruptReplicasMap.java:addToCorruptReplicasMap(76))
- BLOCK NameSystem.addToCorruptReplicasMap: blk_1073741827 added as corrupt on 127.0.0.1:46044
by localhost/127.0.0.1  because block is COMPLETE and reported genstamp 1003 does not match
genstamp in block map 1004
2015-03-18 15:11:02,922 INFO  BlockStateChange (BlockManager.java:invalidateBlock(1215)) -
BLOCK* invalidateBlock: blk_1073741827_1003(stored=blk_1073741827_1004) on 127.0.0.1:46044
2015-03-18 15:11:02,922 INFO  BlockStateChange (BlockManager.java:invalidateBlock(1225)) -
BLOCK* invalidateBlocks: postponing invalidation of blk_1073741827_1003(stored=blk_1073741827_1004)
on 127.0.0.1:46044 because 1 replica(s) are located on nodes with potentially out-of-date
block reports
{code}
If I add {{triggerBlockReports()}} before {{waitReplication()}} then the test passes, as it
finally triggers deletion of the replica on the DN.
I am fine with fixing the test by adding {{triggerBlockReports()}} as above, but I don't know
the reason for postponing replica deletion. Postponing should probably be avoided
in this case, since {{commitBlockSync()}} is as good as a block report for that particular
block.
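
To make the postponement behavior above concrete, here is a minimal toy model in Java. It is not Hadoop code: the class {{PostponedInvalidationSketch}} and all of its methods are hypothetical names chosen for illustration, and it simplifies the rule to "defer deletion while the target DN's block report may be out of date". It only sketches why triggering a block report lets the deletion go through.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model (hypothetical, not Hadoop code) of postponed replica invalidation. */
public class PostponedInvalidationSketch {
    // DNs whose last block report is considered potentially out of date.
    private final Set<String> staleNodes = new HashSet<>();
    // Invalidations deferred until a fresh block report arrives: {replica, dn}.
    private final List<String[]> postponed = new ArrayList<>();
    // Replicas that have actually been scheduled for deletion on a DN.
    private final Set<String> scheduledForDeletion = new HashSet<>();

    public void markStale(String dn) {
        staleNodes.add(dn);
    }

    /** Schedule deletion, or postpone it if the DN's report may be stale. */
    public void invalidateBlock(String replica, String dn) {
        if (staleNodes.contains(dn)) {
            postponed.add(new String[] {replica, dn});
        } else {
            scheduledForDeletion.add(replica + "@" + dn);
        }
    }

    /** A fresh block report clears staleness and retries postponed work. */
    public void processBlockReport(String dn) {
        staleNodes.remove(dn);
        List<String[]> retry = new ArrayList<>(postponed);
        postponed.clear();
        for (String[] p : retry) {
            invalidateBlock(p[0], p[1]);
        }
    }

    public boolean isScheduledForDeletion(String replica, String dn) {
        return scheduledForDeletion.contains(replica + "@" + dn);
    }

    public static void main(String[] args) {
        PostponedInvalidationSketch nn = new PostponedInvalidationSketch();
        String dn = "127.0.0.1:46044";
        String replica = "blk_1073741827_1003";
        nn.markStale(dn);                 // e.g. the DN has just restarted
        nn.invalidateBlock(replica, dn);  // postponed, as in the log above
        System.out.println(nn.isScheduledForDeletion(replica, dn));
        nn.processBlockReport(dn);        // analogous to triggering a block report
        System.out.println(nn.isScheduledForDeletion(replica, dn));
    }
}
```

In this model, the test-side workaround corresponds to calling {{processBlockReport()}} before waiting for replication. The alternative suggested above would correspond to clearing the postponement for that one block when the commit arrives.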

BTW, your change completely eliminates the failure of {{testTruncateWithDataNodesRestartImmediately()}}
from HDFS-7886, which I ran without {{triggerBlockReports()}}.

> commitBlockSynchronization() does not remove locations
> ------------------------------------------------------
>
>                 Key: HDFS-7930
>                 URL: https://issues.apache.org/jira/browse/HDFS-7930
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.0
>            Reporter: Konstantin Shvachko
>            Assignee: Yi Liu
>            Priority: Blocker
>         Attachments: HDFS-7930.001.patch, HDFS-7930.002.patch
>
>
> When {{commitBlockSynchronization()}} has fewer {{newTargets}} than the original block,
it does not remove unconfirmed locations. As a result, the block stores locations
with different lengths or genStamps (corrupt).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
