hadoop-hdfs-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-744) Support hsync in HDFS
Date Fri, 18 May 2012 23:17:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279321#comment-13279321 ]

Lars Hofhansl commented on HDFS-744:
------------------------------------

Thanks Todd and Nicholas.

* To your first point, this is why I initially had hsync actually fsync only if the stream
was created accordingly. So it's only a performance change if the stream was consciously created
with SYNC_BLOCK. Nicholas had suggested that hsync should always sync. I'm also happy to add
a syncFS(boolean sync) to SequenceFile.Writer.
* Re: sending the flag along with the last packet. I was seeing this more as a set of flags
that describe the stream to the DN (like a file descriptor). For this specific flag it would
be ok to send it only with the last packet of a block; for other (future) flags that might
not be the case. Do you feel strongly about this? Happy to change it if you do.
* The logic is indeed there to avoid a double sync. Specifically, this is the case where an empty
sync-packet was sent to trigger a sync on the DN. Will add a comment.
* Will make these flags optional with a default of false.
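
To illustrate the double-sync point, here is a minimal sketch of one generic way to skip a redundant sync: track whether any bytes arrived since the last sync. This is not the logic in the patch; the class DoubleSyncSketch and its methods are hypothetical names made up for this example, and the "sync" is only counted rather than performed.

```java
// Hypothetical model of a receiver that skips redundant syncs; not DataNode code.
public class DoubleSyncSketch {
    private boolean dirty = false; // true when bytes arrived since the last sync
    private int syncCount = 0;     // number of syncs that were actually performed

    // A packet with dataLength == 0 and sync == true models the empty sync-packet.
    public void receivePacket(int dataLength, boolean sync) {
        if (dataLength > 0) {
            dirty = true;
        }
        if (sync) {
            syncIfDirty();
        }
    }

    // Models closing a block on a stream that asked for a sync on close.
    public void closeBlock(boolean syncOnClose) {
        if (syncOnClose) {
            syncIfDirty();
        }
    }

    private void syncIfDirty() {
        if (!dirty) {
            return; // nothing new to flush, so a second sync would be wasted work
        }
        syncCount++; // a real receiver would fsync the block file here
        dirty = false;
    }

    public int syncCount() {
        return syncCount;
    }

    public static void main(String[] args) {
        DoubleSyncSketch receiver = new DoubleSyncSketch();
        receiver.receivePacket(512, false); // data packet
        receiver.receivePacket(0, true);    // empty sync-packet triggers the sync
        receiver.closeBlock(true);          // close does not sync again
        System.out.println("syncs performed: " + receiver.syncCount());
    }
}
```

In this model the empty sync-packet performs the only sync, and the close that follows finds nothing left to flush.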

I guess the main part to decide: Should DFSOutputStream.hsync() only sync if the stream was
created with SYNC_BLOCK? If not, should we deprecate SequenceFile.Writer.syncFS() and add
SequenceFile.Writer.syncFS(boolean)?
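
To make the two options concrete, here is a rough sketch that models them against a local file. It is not HDFS code: HsyncPolicySketch, SketchStream, and Policy are hypothetical names, the syncBlock field merely stands in for a stream created with SYNC_BLOCK, and FileChannel.force(true) plays the role of the fsync on the DN.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical model, not HDFS code: compares the two hsync policies under discussion.
public class HsyncPolicySketch {
    public enum Policy { SYNC_ONLY_IF_FLAGGED, ALWAYS_SYNC }

    static final class SketchStream implements AutoCloseable {
        private final FileChannel channel;
        private final boolean syncBlock; // stands in for "created with SYNC_BLOCK"
        private final Policy policy;
        int fsyncCount = 0;

        SketchStream(Path path, boolean syncBlock, Policy policy) throws IOException {
            this.channel = FileChannel.open(path,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            this.syncBlock = syncBlock;
            this.policy = policy;
        }

        void write(byte[] data) throws IOException {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
        }

        // Option 1 forces to disk only for flagged streams; option 2 always does.
        void hsync() throws IOException {
            if (policy == Policy.ALWAYS_SYNC || syncBlock) {
                channel.force(true); // local stand-in for a posix fsync
                fsyncCount++;
            }
        }

        @Override
        public void close() throws IOException {
            channel.close();
        }
    }

    // Writes a few bytes, calls hsync once, and reports how many fsyncs happened.
    public static int run(boolean syncBlock, Policy policy) throws IOException {
        Path tmp = Files.createTempFile("hsync-sketch", ".bin");
        try (SketchStream stream = new SketchStream(tmp, syncBlock, policy)) {
            stream.write(new byte[] {1, 2, 3});
            stream.hsync();
            return stream.fsyncCount;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("flag off, sync only if flagged: "
                + run(false, Policy.SYNC_ONLY_IF_FLAGGED));
        System.out.println("flag on,  sync only if flagged: "
                + run(true, Policy.SYNC_ONLY_IF_FLAGGED));
        System.out.println("flag off, always sync:          "
                + run(false, Policy.ALWAYS_SYNC));
    }
}
```

Under the first policy, callers of hsync on an unflagged stream see no extra disk cost; under the second, every hsync call pays for the fsync regardless of how the stream was created.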

                
> Support hsync in HDFS
> ---------------------
>
>                 Key: HDFS-744
>                 URL: https://issues.apache.org/jira/browse/HDFS-744
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, hdfs client
>            Reporter: Hairong Kuang
>            Assignee: Lars Hofhansl
>         Attachments: HDFS-744-trunk-v2.patch, HDFS-744-trunk-v3.patch, HDFS-744-trunk-v4.patch,
HDFS-744-trunk-v5.patch, HDFS-744-trunk.patch, hdfs-744-v2.txt, hdfs-744-v3.txt, hdfs-744.txt
>
>
> HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, the real expected semantics should be "flushes out to all replicas and all replicas have done posix fsync equivalent - ie the OS has flushed it to the disk device (but the disk may have it in its cache)." This jira aims to implement the expected behaviour.


        
