hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-970) FSImage writing should always fsync before close
Date Tue, 11 May 2010 07:11:43 GMT

     [ https://issues.apache.org/jira/browse/HDFS-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-970:

    Attachment: hdfs-970.txt

Trivial patch to add fsyncs to the image saving, fstime writing, and TransferFsImage code.
There are no tests included, since the only way to test this is to pull a power plug :)

There is probably some negative performance impact, depending on the size of the files, dirty page
limits, available RAM, etc., but I think the safety factor is well worth it!
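
For illustration only, here is a minimal sketch of the fsync-before-close pattern the patch describes, using FileChannel.force as mentioned in the issue description below. The class and method names are hypothetical and are not taken from the attached hdfs-970.txt:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;

    // Hypothetical sketch, not the attached patch: flush and force the
    // image bytes to stable storage before close, so a crash cannot
    // leave a truncated or zero-filled file behind.
    class SyncedImageWrite {
      static void writeAndSync(File imageFile, byte[] contents) throws IOException {
        FileOutputStream fos = new FileOutputStream(imageFile);
        try {
          fos.write(contents);
          fos.flush();                   // push buffered bytes to the OS
          fos.getChannel().force(true);  // fsync data and metadata to disk
        } finally {
          fos.close();
        }
      }
    }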

> FSImage writing should always fsync before close
> ------------------------------------------------
>                 Key: HDFS-970
>                 URL: https://issues.apache.org/jira/browse/HDFS-970
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: hdfs-970.txt
>
> Without an fsync, it's common that filesystems will delay the writing of metadata to
> the journal until all of the data blocks have been flushed. If the system crashes while the
> dirty pages haven't been flushed, the file is left in an indeterminate state. In some FSs
> (e.g. ext4) this will result in a 0-length file. In others (e.g. XFS) it will result in a file of the
> correct length, but with any number of data blocks zeroed. Calling FileChannel.force before
> closing the FSImage prevents this issue.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
