hadoop-hdfs-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-744) Support hsync in HDFS
Date Fri, 18 May 2012 23:17:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279321#comment-13279321 ]

Lars Hofhansl commented on HDFS-744:

Thanks Todd and Nicholas.

* To your first point: this is why I initially had hsync perform an actual fsync only if the stream
was created accordingly, so it is only a performance change if the stream was consciously created
with SYNC_BLOCK. Nicholas had suggested that hsync should always sync. I'm also happy to add
a syncFS(boolean sync) to SequenceFile.Writer.
* Re: sending the flag along with the last packet. I was seeing this more as a set of flags
that describe the stream to the DN (like a file descriptor). For this specific flag it would
be OK to send it only with the last packet of a block; for other (future) flags that might
not be the case. Do you feel strongly about this? Happy to change it if you do.
* The logic is indeed to avoid a double sync. Specifically, this is the case where an empty sync packet
was sent to trigger a sync on the DN. Will add a comment.
* Will make these flags optional with a default of false.
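
To make the double-sync point concrete, here is a minimal sketch of the idea in plain Java. It is
not the code from the attached patches: StreamFlag, Packet and BlockReceiverSketch are hypothetical
names, the "sync" is only a counter standing in for an fsync on the DN, and tracking unsynced bytes
with a boolean is an assumption about how the check could be done.

```java
import java.util.EnumSet;

// Hypothetical model of per-stream flags sent to the DN; not the HDFS-744 patch.
enum StreamFlag { SYNC_BLOCK }

// Hypothetical packet: payload size plus two markers.
final class Packet {
    final int dataLen;            // 0 models an empty packet sent only to trigger a sync
    final boolean lastInBlock;    // true for the packet that closes the block
    final boolean syncRequested;  // true when the client called hsync

    Packet(int dataLen, boolean lastInBlock, boolean syncRequested) {
        this.dataLen = dataLen;
        this.lastInBlock = lastInBlock;
        this.syncRequested = syncRequested;
    }
}

// Hypothetical receiver that syncs on request and on block close, but never twice
// for the same bytes.
final class BlockReceiverSketch {
    private final EnumSet<StreamFlag> flags;
    private boolean unsynced = false; // bytes received since the last sync
    private int syncCount = 0;        // stand-in for real fsync calls

    BlockReceiverSketch(EnumSet<StreamFlag> flags) {
        this.flags = EnumSet.copyOf(flags);
    }

    void receive(Packet p) {
        if (p.dataLen > 0) {
            unsynced = true;
        }
        // A stream created with SYNC_BLOCK also syncs when the block is closed.
        boolean closeSync = p.lastInBlock && flags.contains(StreamFlag.SYNC_BLOCK);
        // Skip the sync when everything is already on disk: this is the case where
        // an empty sync packet was processed just before the block was closed.
        if ((p.syncRequested || closeSync) && unsynced) {
            syncCount++;
            unsynced = false;
        }
    }

    int syncCount() {
        return syncCount;
    }
}
```

With SYNC_BLOCK set, a data packet followed by an empty sync packet and then the closing packet
results in one sync in this model, not two. Without the flag, closing the block alone does not sync.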

I guess the main part to decide: Should DFSOutputStream.hsync() only sync if the stream was
created with SYNC_BLOCK? If not, should we deprecate SequenceFile.Writer.syncFS() and add

> Support hsync in HDFS
> ---------------------
>                 Key: HDFS-744
>                 URL: https://issues.apache.org/jira/browse/HDFS-744
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, hdfs client
>            Reporter: Hairong Kuang
>            Assignee: Lars Hofhansl
>         Attachments: HDFS-744-trunk-v2.patch, HDFS-744-trunk-v3.patch, HDFS-744-trunk-v4.patch,
HDFS-744-trunk-v5.patch, HDFS-744-trunk.patch, hdfs-744-v2.txt, hdfs-744-v3.txt, hdfs-744.txt
> HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, the real
expected semantics should be "flushes out to all replicas and all replicas have done posix
fsync equivalent - ie the OS has flushed it to the disk device (but the disk may have it in
its cache)." This jira aims to implement the expected behaviour.
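
For readers unfamiliar with the flush/fsync distinction in the description, the sketch below shows
the local, single-file analogue in plain Java: flush() only hands bytes to the OS, while
FileChannel.force(true) asks the OS to write data and metadata to the storage device. The class and
method names (LocalSyncSketch, writeAndSync) are hypothetical, and this illustrates the semantics on
one machine only; it says nothing about how the DataNode implements them across replicas.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical helper illustrating "posix fsync equivalent" for a local file.
final class LocalSyncSketch {
    // Writes data to file, forces it to the device, and returns the file size.
    static long writeAndSync(Path file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile())) {
            out.write(data);
            out.flush();                  // hflush-like: bytes leave the process
            out.getChannel().force(true); // fsync-like: OS writes data and metadata out
            return out.getChannel().size();
        }
    }
}
```

As the description notes, even after such a call the disk may still hold the data in its own cache.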
