hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-955) FSImage.saveFSImage can lose edits
Date Thu, 11 Feb 2010 23:30:28 GMT

    [ https://issues.apache.org/jira/browse/HDFS-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832730#action_12832730 ]

Todd Lipcon commented on HDFS-955:
----------------------------------

bq. One thing I'd also like to consider more is the interaction of these processes with filesystem
journaling

I read up a bit on journaling implementations in ext3, ext4, and xfs. It looks like the current
"write file, close, then rename to replace" method is considered an antipattern by the kernel
folks, since with delayed allocation it's possible that on a crash the new file will end up
with length 0 (or zeroed blocks in XFS). We definitely need an fsync after writing the new
fsimage. Will open a separate JIRA for this.
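
To illustrate the pattern described above, here is a minimal sketch of "write to a temporary file, fsync, then rename over the target". It is not the HDFS code and not the patch attached to this issue: the class name DurableReplace, the method replaceDurably, and the ".tmp" suffix are hypothetical, and it relies on java.nio.file, which is newer than the Hadoop versions listed here. The point is only the ordering: force the data to disk before the rename makes the new file visible under the final name.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration only; not part of HDFS. */
public class DurableReplace {

    /**
     * Replaces target with contents using write, fsync, rename.
     * The temporary file lives next to the target so the rename
     * stays within one filesystem.
     */
    public static void replaceDurably(Path target, byte[] contents) throws IOException {
        // Assumed naming scheme for the temporary file.
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");

        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(contents);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // The fsync step: flush file data and metadata to the device
            // before the file is renamed into place.
            ch.force(true);
        }

        // On POSIX filesystems this maps to rename(2), which replaces an
        // existing target atomically. Behavior when the target exists is
        // implementation specific on other platforms.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

A caller would invoke DurableReplace.replaceDurably(path, bytes) in place of a plain write-close-rename sequence. This sketch does not fsync the parent directory, so whether the rename itself survives a crash still depends on the filesystem.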

> FSImage.saveFSImage can lose edits
> ----------------------------------
>
>                 Key: HDFS-955
>                 URL: https://issues.apache.org/jira/browse/HDFS-955
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-955-unittest.txt, PurgeEditsBeforeImageSave.patch
>
>
> This is a continuation of a discussion from HDFS-909. The FSImage.saveFSImage function
> (implementing dfsadmin -saveNamespace) can corrupt the NN storage such that all current edits
> are lost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.