hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
Date Mon, 26 Jan 2015 21:42:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292454#comment-14292454 ]

Jing Zhao commented on HDFS-7611:
---------------------------------

Thanks for digging into the issue, [~Byron Wong]!

So currently we have two ways to fix the issue:
# While applying the editlog, call {{FSDirectory#updateCount}}, which checks whether the image/editlog has been loaded, instead of {{INode#addSpaceConsumed}}.
# Stop computing the quota change and updating the quota usage in {{FSDirectory#removeLastINode}}, and move that quota computation/update to its caller. This way the quota usage change, even if it is wrong, will not affect the real deletion.

Both changes are actually necessary, but #1 requires a lot of code refactoring. Since #2 alone
also fixes the reported bug, I think we can do #1 in a separate jira.
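
To make the two options concrete, here is a minimal toy sketch in Java. It is not the NameNode code: the {{Dir}} class, the {{imageLoaded}} flag, and the {{removeChild}}/{{delete}} helpers are hypothetical stand-ins, and only the name {{updateCount}} is borrowed from the discussion above. It assumes the essence of #1 is "skip quota bookkeeping until loading has finished" and the essence of #2 is "removal reports what it removed, and the caller alone handles quota".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class QuotaDecouplingSketch {
    /** Hypothetical directory: child name -> block ids, plus a quota usage counter. */
    static class Dir {
        final Map<String, List<String>> children = new HashMap<>();
        long nsUsed; // namespace usage tracked for quota purposes
    }

    /** Hypothetical stand-in for "has the image/editlog been loaded?". */
    static boolean imageLoaded = true;

    /** Option #1 in miniature: do not touch quota usage while still loading. */
    static void updateCount(Dir dir, long nsDelta) {
        if (!imageLoaded) {
            return; // still replaying the image/editlog, skip quota bookkeeping
        }
        dir.nsUsed += nsDelta;
    }

    /**
     * Option #2 in miniature: removal only removes. It returns the removed
     * child's blocks, or null if no such child exists. No quota logic here.
     */
    static List<String> removeChild(Dir dir, String name) {
        return dir.children.remove(name);
    }

    /**
     * The caller owns the quota update. computedNsUsage stands for a quota
     * computation that may be wrong; it cannot change which blocks are
     * returned for deletion.
     */
    static List<String> delete(Dir dir, String name, long computedNsUsage) {
        List<String> blocks = removeChild(dir, name);
        if (blocks == null) {
            return new ArrayList<>(); // nothing was removed
        }
        updateCount(dir, -computedNsUsage);
        return blocks; // always collected once the removal succeeded
    }

    public static void main(String[] args) {
        Dir dir = new Dir();
        dir.children.put("f", new ArrayList<>(Arrays.asList("blk_1", "blk_2")));
        dir.nsUsed = 1;

        imageLoaded = false; // simulate editlog replay during restart
        List<String> toRemove = delete(dir, "f", 0L); // deliberately wrong usage
        System.out.println(toRemove + " nsUsed=" + dir.nsUsed);
    }
}
```

In this sketch both blocks of the deleted file are still returned for removal even though the supplied usage value is wrong and quota updates are skipped during loading, which is the property #2 is after.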

> deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7611
>                 URL: https://issues.apache.org/jira/browse/HDFS-7611
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Konstantin Shvachko
>            Assignee: Byron Wong
>            Priority: Critical
>         Attachments: blocksNotDeletedTest.patch, testTruncateEditLogLoad.log
>
>
> If quotas are enabled, a combination of the operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, can prevent the NameNode from coming out of safeMode, and could cause a memory leak during startup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
