hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4140) fuse-dfs silently truncates files being overwritten
Date Tue, 06 Nov 2012 22:20:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491883#comment-13491883
] 

Colin Patrick McCabe commented on HDFS-4140:
--------------------------------------------

The description is a little misleading here.  Basically, the problem is that this operation:

{code}
open("/mnt/fuse-dfs/t", O_CREAT | O_TRUNC | O_WRONLY, 0644);
{code}

gets translated into this sequence of fuse-dfs calls:

{code}
TRACE open /t
TRACE truncate /t
TRACE unlink /t
TRACE getattr /t
TRACE flush /t
TRACE release /t
{code}

(I'm assuming that another open would have followed if our unlink hadn't returned an error.)
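A minimal sketch of a reproducer for the open-with-truncate path described above. It uses only standard POSIX calls; the helper names {{overwrite_file}} and {{file_size}} are made up for illustration and are not part of fuse-dfs.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `data` to `path`, replacing any existing
 * content, using the same flags as the open() call quoted above.
 * Returns 0 on success, -1 on any failure. */
static int overwrite_file(const char *path, const char *data)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(data);
    ssize_t written = write(fd, data, len);

    /* Check close() too: a filesystem may report a write error only
     * when the descriptor is closed. */
    int close_rc = close(fd);

    if (written < 0 || (size_t)written != len || close_rc != 0)
        return -1;
    return 0;
}

/* Hypothetical helper: size of `path` in bytes, or -1 if stat() fails. */
static long long file_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

Calling {{overwrite_file(path, "foo\n")}} and then {{overwrite_file(path, "bar\n")}} on an ordinary local filesystem should leave {{file_size(path)}} at 4. On an affected fuse-dfs mount, the issue description reports a size of 0 after the second write.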

There are a few different quality-of-implementation issues here:
* HDFS doesn't react well to unlink of a file while it's open, which is what we're doing here
* truncate tries to do its own create + close cycle in the middle, which basically means that
we're trying to open a file for write while it's already open, which is not good.

{{FUSE_CAP_ATOMIC_O_TRUNC}} could help stop fuse from translating open into SO MANY complicated
fuse operations.  However, it's not supported on all kernel versions (I'm fairly sure it's
not available on CentOS 5, for example).  There are a bunch of hacks we could do to "fix" this on older
kernels, but they won't be easy.
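A rough sketch of the capability check this implies: request atomic {{O_TRUNC}} only when the kernel advertises it, and otherwise signal that a fallback is needed. To keep it compilable without libfuse, {{struct conn_info}} and {{CAP_ATOMIC_O_TRUNC}} are stand-ins modelled on the {{capable}} / {{want}} fields of libfuse 2.x's {{struct fuse_conn_info}}, and {{request_atomic_o_trunc}} is a hypothetical name. Real code would include {{<fuse.h>}} and run this check in the filesystem's init callback. This is not the fix that was committed for this issue.

```c
/* Stand-in for libfuse 2.x's struct fuse_conn_info; only the two fields
 * used below are modelled. Real code would include <fuse.h> instead. */
struct conn_info {
    unsigned capable;  /* capabilities the kernel offers */
    unsigned want;     /* capabilities the filesystem requests */
};

/* Stand-in for FUSE_CAP_ATOMIC_O_TRUNC (assumed value; use the
 * definition from the libfuse headers in real code). */
#define CAP_ATOMIC_O_TRUNC (1u << 3)

/* Request atomic O_TRUNC handling if the kernel offers it.
 * Returns 1 if the capability was requested, 0 if the caller has to
 * fall back to handling a separate truncate call. */
static int request_atomic_o_trunc(struct conn_info *conn)
{
    if (conn->capable & CAP_ATOMIC_O_TRUNC) {
        conn->want |= CAP_ATOMIC_O_TRUNC;
        return 1;
    }
    return 0;
}
```

With the capability enabled, the open handler receives {{O_TRUNC}} in its flags and can truncate as part of the open itself. A return value of 0 corresponds to the older-kernel case mentioned above, where one of the harder workarounds would still be needed.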
                
> fuse-dfs silently truncates files being overwritten
> ---------------------------------------------------
>
>                 Key: HDFS-4140
>                 URL: https://issues.apache.org/jira/browse/HDFS-4140
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fuse-dfs
>    Affects Versions: 2.0.2-alpha
>            Reporter: Andy Isaacson
>            Assignee: Colin Patrick McCabe
>
> When fuse-dfs is mounted in RW mode, overwriting a file that has content results in the
> file being truncated to 0 bytes (losing both the old and the new content).
> {noformat}
> ubuntu@ubu-cdh-0:~$ echo foo > /export/hdfs/tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a
> total 0
> -rw-r--r-- 1 ubuntu hadoop 4 Nov  1 15:21 t1.txt
> ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a
> Found 1 items
> -rw-r--r--   3 ubuntu hadoop          4 2012-11-01 15:21 /tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ echo bar > /export/hdfs/tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a
> total 0
> -rw-r--r-- 1 ubuntu hadoop 0 Nov  1 15:22 t1.txt
> ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a
> Found 1 items
> -rw-r--r--   3 ubuntu hadoop          0 2012-11-01 15:22 /tmp/a/t1.txt
> {noformat}

