hadoop-hdfs-issues mailing list archives

From "Fengdong Yu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-4688) DFSClient should not allow multiple concurrent creates for the same file
Date Fri, 12 Apr 2013 04:47:18 GMT

     [ https://issues.apache.org/jira/browse/HDFS-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Fengdong Yu updated HDFS-4688:

    Attachment:     (was: HDFS-4688.txt)
> DFSClient should not allow multiple concurrent creates for the same file
> ------------------------------------------------------------------------
>                 Key: HDFS-4688
>                 URL: https://issues.apache.org/jira/browse/HDFS-4688
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.0.3-alpha
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: TestBadFileMaker.java
> Credit to Harsh for tracing down most of this.
> If a DFSClient calls create with overwrite multiple times on the same file, it can get
> into a bad state. The exact failure mode depends on the state of the file, but at the very
> least one DFSOutputStream will "win" over the others, so data written to the other
> DFSOutputStreams is lost. While this is perhaps okay because of overwrite semantics,
> we've also seen other cases where the DFSClient loops indefinitely on close and blocks
> get marked as corrupt. This is not okay.
> One fix is to add locking to DFSClient that prevents a user from opening multiple
> concurrent output streams to the same path.
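
The locking idea proposed above could look roughly like the sketch below. This is only an illustration of per-path exclusion inside a single client, not the patch attached to this issue and not existing DFSClient code: the class OpenPathGuard and its methods acquire, release and isOpen are hypothetical names, and a real change would presumably surface the failure as an IOException from create and release the path when the stream is closed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Hadoop: tracks which paths this client
// currently has open for write and rejects a second concurrent open.
final class OpenPathGuard {
    // Thread-safe set of paths with an active output stream.
    private final Set<String> openForWrite = ConcurrentHashMap.newKeySet();

    // Called before an output stream is created for the path.
    // Set.add is atomic here, so only one caller can win for a given path.
    void acquire(String path) {
        if (!openForWrite.add(path)) {
            throw new IllegalStateException(
                "Path is already open for write by this client: " + path);
        }
    }

    // Called when the output stream for the path is closed or aborted.
    void release(String path) {
        openForWrite.remove(path);
    }

    // True while some stream in this client holds the path.
    boolean isOpen(String path) {
        return openForWrite.contains(path);
    }
}
```

With a guard like this, a second create on the same path fails fast in the client instead of producing competing output streams; it does not address separate clients writing the same file.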

