hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vinay (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5428) under construction files deletion after snapshot+checkpoint+nn restart leads nn safemode
Date Thu, 07 Nov 2013 02:12:17 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815558#comment-13815558
] 

Vinay commented on HDFS-5428:
-----------------------------

Hi Jing, thanks for posting the simplified patch.

Patch looks quite good, making all unit test in my patch pass.

Small improvements required to satisfy below points as well.
bq. (From issue Description) So when the Datanode reports RBW blocks those will not be updated
in blocksmap. Some of the FINALIZED blocks will be marked as corrupt due to length mismatch.
This problem is still there, because while loading the fsimage, snapshot inodes are not replaced
with an UCInode and last block is COMPLETE. In this case after reloading from fsimage we will
not be able to read the last block. 
Replacing such inodes with UCInode is required.


> under construction files deletion after snapshot+checkpoint+nn restart leads nn safemode
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-5428
>                 URL: https://issues.apache.org/jira/browse/HDFS-5428
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Vinay
>            Assignee: Vinay
>         Attachments: HDFS-5428-v2.patch, HDFS-5428.000.patch, HDFS-5428.patch
>
>
> 1. allow snapshots under dir /foo
> 2. create a file /foo/test/bar and start writing to it
> 3. create a snapshot s1 under /foo after block is allocated and some data has been written
to it
> 4. Delete the directory /foo/test
> 5. wait till checkpoint or do saveNameSpace
> 6. restart NN.
> NN enters to safemode.
> Analysis:
> Snapshot nodes loaded from fsimage are always complete and all blocks will be in COMPLETE
state. 
> So when the Datanode reports RBW blocks those will not be updated in blocksmap.
> Some of the FINALIZED blocks will be marked as corrupt due to length mismatch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Mime
View raw message