hadoop-hdfs-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-744) Support hsync in HDFS
Date Mon, 07 May 2012 17:30:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269798#comment-13269798 ]

Lars Hofhansl commented on HDFS-744:

Thanks Stack.

The sync flag does two things: it causes blocks to be fsync'ed upon close at the DN, and
it causes the DN to fsync right away when the client issues a sync().
I can call it force (the closest POSIX equivalent would be O_SYNC). I can also add a FileMode
(which could eventually include append, overwrite, and sync, but would only do sync for now).
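
As a rough illustration of those two behaviours (this is not the patch itself), here is a minimal sketch in plain java.nio. FileChannel.force(true) is the JDK call closest to fsync. The class SyncOnCloseWriter, its syncFlag field, and the forceCalls counter are hypothetical names invented for this example, and a real DN write path is considerably more involved.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for a DN-side block file writer; not HDFS code.
class SyncOnCloseWriter implements Closeable {
    private final FileChannel channel;
    private final boolean syncFlag;   // assumed per-file flag, as discussed above
    int forceCalls = 0;               // counts fsync-like calls, for demonstration only

    SyncOnCloseWriter(Path file, boolean syncFlag) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING);
        this.syncFlag = syncFlag;
    }

    void write(byte[] data) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    // Client-issued sync(): force to disk right away, but only when the flag is set.
    void sync() throws IOException {
        if (syncFlag) {
            force();
        }
    }

    private void force() throws IOException {
        channel.force(true);   // flush file data and metadata to the storage device
        forceCalls++;
    }

    // On close, the file is forced once more when the flag is set.
    @Override
    public void close() throws IOException {
        try {
            if (syncFlag) {
                force();
            }
        } finally {
            channel.close();
        }
    }
}
```

With the flag off, sync() and close() in this sketch never call force(), which corresponds to leaving the data in the OS buffers.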

bq. Does this mean we don't need to set the sync flag for the whole filesystem such that on
close we call filechannel force? We can do it on a file-by-file basis?

bq. The sync flag in the checksum filesystem seems to line up w/ the create parent flag in
the method call. That's probably not what you want?
Whoops. Yeah, checksum filesystem had a private create method that happened to match the signature.
Will fix. Means the signatures in FileSystem need to be different.

bq. Are you thinking we'd work on optimizations later, after this patch went in; e.g. sync
only the first block in pipeline while we are replicating to the other pipeline members, etc.?
That'd be a bigger change. Need to sync in a separate thread, then flush to replicas, then
wait for sync and flush to finish. What I had in mind was to sync synchronously at one of
the replicas.
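
The ordering of that bigger change could be sketched roughly as follows. This is only an illustration of "sync in a separate thread, flush to replicas, wait for both"; PipelinedSync, localSync and forwardToReplicas are hypothetical names, and the two Runnables stand in for the real fsync and pipeline forwarding.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical sketch; not part of the HDFS-744 patch.
class PipelinedSync {
    static void syncAndForward(Runnable localSync,
                               Runnable forwardToReplicas,
                               ExecutorService pool) {
        // 1. Start the local fsync on a separate thread.
        CompletableFuture<Void> sync = CompletableFuture.runAsync(localSync, pool);

        // 2. Meanwhile, flush the data downstream on the calling thread.
        forwardToReplicas.run();

        // 3. Return only once the local fsync has finished too.
        //    join() rethrows a sync failure wrapped in CompletionException.
        sync.join();
    }
}
```

The simpler alternative mentioned above, syncing synchronously at one of the replicas, would amount to calling localSync.run() inline on that replica.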

Also noticed that the new create method in FileSystem calls itself, rather than the one that does
not have the sync flag.
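
A simplified picture of that last problem, with made-up signatures (the real FileSystem.create overloads take more parameters; CreateOverloads and its String return values are purely illustrative):

```java
// Hypothetical, heavily simplified overloads; not the actual FileSystem API.
class CreateOverloads {
    // Existing overload without the sync flag.
    static String create(String path, boolean overwrite) {
        return "create(" + path + ", overwrite=" + overwrite + ")";
    }

    // New overload with the sync flag.
    static String create(String path, boolean overwrite, boolean sync) {
        // The bug described above corresponds to writing
        //     return create(path, overwrite, sync);
        // here, which recurses until the stack overflows.
        // The intended delegation targets the overload without the flag:
        String result = create(path, overwrite);
        return sync ? result + " +sync" : result;
    }
}
```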

> Support hsync in HDFS
> ---------------------
>                 Key: HDFS-744
>                 URL: https://issues.apache.org/jira/browse/HDFS-744
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>         Attachments: hdfs-744.txt
> HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, the real
expected semantics should be "flushes out to all replicas and all replicas have done posix
fsync equivalent - ie the OS has flushed it to the disk device (but the disk may have it in
its cache)." This jira aims to implement the expected behaviour.


