hadoop-hdfs-issues mailing list archives

From "Luke Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3979) Fix hsync and hflush semantics.
Date Mon, 05 Nov 2012 20:14:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490887#comment-13490887 ]

Luke Lu commented on HDFS-3979:
-------------------------------

bq. I think it will decrease the performance for non-sync write.

It would be nice if we could show/quantify the decrease in performance for non-sync writes. It
may not be wise to introduce complexity and make hflush less robust if this is a non-issue.

bq. The existing tests: TestFiPipelines and TestFiHFlush do not cover the other scenarios
you worry about?

It seems that TestFiHFlush doesn't cover the failure scenarios. All the test cases are positive
assertions (the pipeline can recover in spite of disk error exceptions), which do not seem very
useful given that the ack is sent before the disk error exceptions are triggered. A new TestFiHSync
seems necessary, especially for the new patch, where the ack code path diverges from hflush.
Basically, I want to make sure that hsync is guaranteed to get an error if the pipeline
cannot be recovered (e.g., because required datanodes ran out of disk space).
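
To illustrate the kind of negative assertion meant here, below is a minimal toy model, not HDFS
code. ToyDataNode, receiveAndSync and clientSeesError are hypothetical names made up for this
sketch; it only assumes that a datanode either acks before its disk sync or after it, and shows
that the client can observe a disk error only in the second case.

```java
import java.io.IOException;

/** Toy model (not HDFS code): a datanode that acks either before or after its disk sync. */
public class AckOrderingSketch {

    /** Hypothetical stand-in for a datanode's block receiver. */
    static final class ToyDataNode {
        private final boolean diskFails;     // simulate a disk error (e.g. out of space) on sync
        private final boolean ackBeforeSync; // true = success ack is sent before the sync runs

        ToyDataNode(boolean diskFails, boolean ackBeforeSync) {
            this.diskFails = diskFails;
            this.ackBeforeSync = ackBeforeSync;
        }

        /** Returns normally if the client gets a success ack; throws if the client sees the error. */
        void receiveAndSync() throws IOException {
            if (ackBeforeSync) {
                // The ack has already gone out, so a later disk failure never reaches the client.
                try {
                    syncToDisk();
                } catch (IOException e) {
                    // Swallowed here: in this model the error is only visible on the datanode.
                }
                return;
            }
            // Sync first: a failure propagates and no success ack is sent.
            syncToDisk();
        }

        private void syncToDisk() throws IOException {
            if (diskFails) {
                throw new IOException("simulated disk error");
            }
        }
    }

    /** Returns true if the client-side sync call reports an error. */
    static boolean clientSeesError(boolean diskFails, boolean ackBeforeSync) {
        try {
            new ToyDataNode(diskFails, ackBeforeSync).receiveAndSync();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ack before sync, disk fails: client sees error = "
                + clientSeesError(true, true));
        System.out.println("ack after sync, disk fails:  client sees error = "
                + clientSeesError(true, false));
    }
}
```

A real TestFiHSync would presumably inject the fault into an actual pipeline rather than a toy
class, but the assertion would have the same shape: with the disk fault injected, the sync call
must fail rather than return normally.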

Anyway, I'm fine with filing another jira for these hflush/hsync improvements.


                
> Fix hsync and hflush semantics.
> -------------------------------
>
>                 Key: HDFS-3979
>                 URL: https://issues.apache.org/jira/browse/HDFS-3979
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0, 0.23.0, 2.0.0-alpha
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>         Attachments: hdfs-3979-sketch.txt, hdfs-3979-v2.txt, hdfs-3979-v3.txt, hdfs-3979-v4.txt
>
>
> See discussion in HDFS-744. The actual sync/flush operation in BlockReceiver is not on
> a synchronous path from the DFSClient, hence it is possible that a DN loses data that it has
> already acknowledged as persisted to a client.
> Edit: Spelling.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
