hadoop-hdfs-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-744) Support hsync in HDFS
Date Mon, 07 May 2012 06:16:46 GMT

     [ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HDFS-744:

    Attachment: hdfs-744.txt

Here's a sketch of a patch against 1.0.x.
I did only light testing on this.

It uses the packet's lastPacketInBlock byte to send flags to the DataNode. (An old client
will work against a new DataNode. A new client should work against an old DataNode as long
as the sync feature is not used, but it would not fail gracefully if the feature were used.)

The client can send two flags with a packet:
1. sync block
2. sync packet

The idea is that #1 would be sent with at least one packet per block in order to sync the
block upon close.
#2 is sent by the client if a sync should be forced immediately on a partial block. If the
client has outstanding data to send anyway, the flag is piggybacked on the packet for that data;
otherwise an empty sync packet is sent if needed.
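
To make the flag idea concrete, here is a minimal sketch of how several flags could share the single lastPacketInBlock byte. The bit positions and every name in it (PacketFlags, SYNC_BLOCK, SYNC_PACKET) are assumptions for illustration; they are not taken from hdfs-744.txt, and the patch may lay the byte out differently.

```java
// Hypothetical sketch: the bit layout and all names are assumptions,
// not the encoding used by hdfs-744.txt.
final class PacketFlags {
    // Old clients only ever send 0 or 1 here, so bit 0 keeps its old meaning.
    static final byte LAST_PACKET_IN_BLOCK = 0x01;
    // Assumed bit: sync the block to disk when it is closed.
    static final byte SYNC_BLOCK = 0x02;
    // Assumed bit: sync immediately, even on a partial block.
    static final byte SYNC_PACKET = 0x04;

    private PacketFlags() {}

    // Pack the three booleans into the one byte sent with each packet.
    static byte encode(boolean lastPacketInBlock, boolean syncBlock, boolean syncPacket) {
        int flags = 0;
        if (lastPacketInBlock) flags |= LAST_PACKET_IN_BLOCK;
        if (syncBlock) flags |= SYNC_BLOCK;
        if (syncPacket) flags |= SYNC_PACKET;
        return (byte) flags;
    }

    static boolean isLastPacketInBlock(byte flags) {
        return (flags & LAST_PACKET_IN_BLOCK) != 0;
    }

    static boolean isSyncBlock(byte flags) {
        return (flags & SYNC_BLOCK) != 0;
    }

    static boolean isSyncPacket(byte flags) {
        return (flags & SYNC_PACKET) != 0;
    }

    public static void main(String[] args) {
        // A byte from an old client (plain 0/1) decodes with no sync flags set.
        byte legacy = 1;
        System.out.println(isLastPacketInBlock(legacy) + " " + isSyncPacket(legacy));

        // A new client asking for an immediate sync on a partial block.
        byte syncNow = encode(false, false, true);
        System.out.println(isLastPacketInBlock(syncNow) + " " + isSyncPacket(syncNow));
    }
}
```

With a layout like this, a receiver that masks individual bits reads an old client's 0 or 1 unchanged. A receiver that instead treats any nonzero byte as "last packet" would misread a sync-only packet, which is one way the new-client/old-DataNode case could fail ungracefully.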

Together they allow a client to guarantee that all bytes up to a certain point are
on disk.
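
The piggybacking rule for the sync-packet flag could look roughly like this on the client side. This is only a sketch: SyncPacketQueue, Packet, and requestSync are hypothetical names, the real client's packet queue is more involved, and the sketch always enqueues an empty packet when nothing is pending rather than deciding whether one is needed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: names and structure are assumptions, not from hdfs-744.txt.
final class SyncPacketQueue {
    static final class Packet {
        final int dataLength;   // 0 for an empty, sync-only packet
        boolean syncPacket;     // true if the DataNode should sync right away

        Packet(int dataLength) {
            this.dataLength = dataLength;
        }
    }

    // Packets waiting to be sent, oldest first.
    private final Deque<Packet> pending = new ArrayDeque<>();

    void enqueueData(int dataLength) {
        pending.addLast(new Packet(dataLength));
    }

    // Request an immediate sync covering everything queued so far.
    void requestSync() {
        Packet last = pending.peekLast();
        if (last != null) {
            // Outstanding data exists: piggyback the flag on its last packet.
            last.syncPacket = true;
        } else {
            // Nothing to send: enqueue an empty packet that only carries the flag.
            Packet empty = new Packet(0);
            empty.syncPacket = true;
            pending.addLast(empty);
        }
    }

    Packet poll() {
        return pending.pollFirst();
    }

    int size() {
        return pending.size();
    }
}
```

Setting the flag on the last queued packet means the sync covers every byte queued before the request, since packets are sent in order.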

Please have a look and let me know whether I'm off track with this.

If not, I'll clean it up, add some tests, create a trunk patch (which I imagine would look
a bit different), and maybe add an only-the-last-replica-syncs option.

> Support hsync in HDFS
> ---------------------
>                 Key: HDFS-744
>                 URL: https://issues.apache.org/jira/browse/HDFS-744
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>         Attachments: hdfs-744.txt
> HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, the real
expected semantics should be "flushes out to all replicas and all replicas have done posix
fsync equivalent - ie the OS has flushed it to the disk device (but the disk may have it in
its cache)." This jira aims to implement the expected behaviour.
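
As a local-file analogy for the "posix fsync equivalent" mentioned above, the following sketch writes to a file and then forces it to the storage device with java.nio's FileChannel.force. It is not HDFS code, and LocalSyncDemo is a hypothetical name; it only illustrates the step each replica would have to perform beyond handing data to the OS.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Local-file analogy only; this is not how HDFS or the patch is implemented.
final class LocalSyncDemo {
    private LocalSyncDemo() {}

    // Writes text to a temporary file, forces it to the device, returns the file size.
    static long writeAndSync(String text) {
        try {
            Path file = Files.createTempFile("sync-demo", ".txt");
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
                while (buf.hasRemaining()) {
                    ch.write(buf);   // data may still sit in OS buffers here
                }
                ch.force(true);      // fsync-like: push data and metadata to the device
                return ch.size();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndSync("hello"));
    }
}
```

As the description notes, even after such a call the disk may still hold the data in its own cache.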


