hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4089) SyncBehindWrites uses wrong flags on sync_file_range
Date Fri, 19 Oct 2012 21:48:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480422#comment-13480422 ]

Todd Lipcon commented on HDFS-4089:

Hi Jan. This flag is not meant to provide data integrity -- its purpose is to avoid lumpy IO writeback.
If you want data integrity, you should call hsync() after every write.
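
To make the distinction concrete, here is a minimal C sketch (Linux only) of the two behaviours discussed in this comment. It is not the HDFS code: the function names are hypothetical, and fdatasync() only stands in for the kind of durable flush that hsync() is meant to request.

```c
#define _GNU_SOURCE   /* exposes sync_file_range() and its flags in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Writeback hint: ask the kernel to start writing the dirty pages in
 * [offset, offset + nbytes) and return without waiting for them.
 * This smooths out IO but gives no durability guarantee. */
int start_writeback(int fd, off_t offset, off_t nbytes)
{
    return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
}

/* Durability: block until the file's data has reached stable storage. */
int flush_for_integrity(int fd)
{
    return fdatasync(fd);
}

/* Hypothetical helper: append len bytes, then either hint writeback for
 * the bytes just written (durable == 0) or flush durably (durable != 0).
 * Returns 0 on success, -1 on failure. */
int write_chunk(int fd, const char *buf, size_t len, int durable)
{
    assert(buf != NULL);

    off_t start = lseek(fd, 0, SEEK_END);   /* offset where this chunk begins */
    if (start < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    if (n < 0 || (size_t)n != len)
        return -1;

    return durable ? flush_for_integrity(fd)
                   : start_writeback(fd, start, (off_t)len);
}
```

Calling write_chunk(fd, buf, len, 0) after every write keeps the amount of dirty data small without blocking the writer, which is the "avoid lumpy writeback" use case. Only the durable variant waits for the data to reach disk.
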
> SyncBehindWrites uses wrong flags on sync_file_range
> ----------------------------------------------------
>                 Key: HDFS-4089
>                 URL: https://issues.apache.org/jira/browse/HDFS-4089
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, ha
>            Reporter: Jan Kunigk
>            Priority: Minor
>         Attachments: syncBehindWrites.patch
> Hi, I stumbled upon this while trying to understand the append design recently. I am
> assuming that when SyncBehindWrites is enabled we do indeed want to do a complete sync
> after each write. In that case the implementation seems wrong to me.
> Here's a comment from the man page of sync_file_range on the use of the
> SYNC_FILE_RANGE_WRITE flag on its own: "This is an asynchronous flush-to-disk operation.
> This is not suitable for data integrity operations." I don't know why this syscall is
> invoked here instead of just
