hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4140) fuse-dfs handles open(O_TRUNC) poorly
Date Thu, 08 Nov 2012 19:34:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493429#comment-13493429 ]

Colin Patrick McCabe commented on HDFS-4140:

I will add a reference to the JIRA in the comment.

Do you have an idea for how to solve this problem in the "long term"?  I have thought about
it and there are no really good choices.  Basically, it boils down to HDFS not supporting
random writes.  We could implement some kind of caching layer that simulated random writes
in fuse-dfs, but it would be complex.  The performance would also be pretty poor.
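A minimal sketch of what such a caching layer might look like, purely for illustration: buffer the file in memory, let the kernel issue random writes against the buffer, and stream the whole thing back to HDFS sequentially on release(). None of these names are fuse-dfs API; this is just the shape of the idea.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical write cache: holds a full copy of the file so random
 * writes can land anywhere, even though HDFS only supports sequential
 * writes underneath. */
struct write_cache {
    char  *buf;  /* file contents */
    size_t len;  /* logical file length */
    size_t cap;  /* allocated capacity */
};

/* Random-access write into the cache; grows the buffer as needed.
 * Returns 0 on success, -1 on allocation failure. */
int cache_pwrite(struct write_cache *c, const char *data,
                 size_t n, size_t off)
{
    if (off + n > c->cap) {
        size_t ncap = (off + n) * 2;
        char *nbuf = realloc(c->buf, ncap);
        if (!nbuf)
            return -1;
        /* zero-fill the hole between old capacity and new */
        memset(nbuf + c->cap, 0, ncap - c->cap);
        c->buf = nbuf;
        c->cap = ncap;
    }
    memcpy(c->buf + off, data, n);
    if (off + n > c->len)
        c->len = off + n;
    return 0;
}

/* On release(), the buffer would be flushed with a single
 * create(overwrite = true) followed by sequential writes. */
```

Even this toy version shows the cost: every open-for-write file needs a full in-memory (or on-disk) copy, and close() becomes a full rewrite of the file.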

We could be a little bit more clever about when we actually open the HDFS file.  For example,
we could convert {{fd = open(O_WRONLY); ftruncate(fd, 0)}} to a {{create(overwrite = true)}}.
That only solves one particular case, though.
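The flag-rewriting part of that idea can be sketched as a pure decision function: inspect the POSIX open(2) flags and pick a single HDFS operation, so that open(O_WRONLY|O_TRUNC) never has to be split into an open plus a separate truncate.  The HDFS_* action names below are illustrative, not fuse-dfs identifiers.

```c
#include <fcntl.h>

/* Hypothetical: the single HDFS operation a given open() maps to. */
enum hdfs_open_action {
    HDFS_OPEN_FOR_READ,     /* plain read open */
    HDFS_CREATE_OVERWRITE,  /* one atomic create(overwrite = true) */
    HDFS_OPEN_FOR_WRITE     /* existing write/append path */
};

/* Collapse POSIX open flags into one HDFS-side operation.  The key
 * case: a write open with O_TRUNC becomes an overwrite-create, so no
 * separate truncate ever races with an open writer. */
enum hdfs_open_action choose_open_action(int flags)
{
    int accmode = flags & O_ACCMODE;

    if (accmode == O_RDONLY)
        return HDFS_OPEN_FOR_READ;
    if (flags & O_TRUNC)
        return HDFS_CREATE_OVERWRITE;
    return HDFS_OPEN_FOR_WRITE;
}
```

As the comment says, this only covers the open-with-truncate case; an ftruncate() on an already-open descriptor still has no good mapping.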

I felt like this particular change was worth doing because it's a small change which simplifies
things.  If we were going to write a caching layer we'd want to have a design discussion first.

bq. I think we should just #error...

We have to support RHEL5, for a little while longer at least.
> fuse-dfs handles open(O_TRUNC) poorly
> -------------------------------------
>                 Key: HDFS-4140
>                 URL: https://issues.apache.org/jira/browse/HDFS-4140
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fuse-dfs
>    Affects Versions: 2.0.2-alpha
>            Reporter: Andy Isaacson
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-4140.003.patch
> fuse-dfs handles open(O_TRUNC) poorly.
> It is converted to multiple fuse operations.  Those multiple fuse operations often fail
> (for example, calling fuse_truncate_impl() while a file is also open for write results in
> a "multiple writers!" exception.)
> One easy way to see the problem is to run the following sequence of shell commands:
> {noformat}
> ubuntu@ubu-cdh-0:~$ echo foo > /export/hdfs/tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a
> total 0
> -rw-r--r-- 1 ubuntu hadoop 4 Nov  1 15:21 t1.txt
> ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a
> Found 1 items
> -rw-r--r--   3 ubuntu hadoop          4 2012-11-01 15:21 /tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ echo bar > /export/hdfs/tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a
> total 0
> -rw-r--r-- 1 ubuntu hadoop 0 Nov  1 15:22 t1.txt
> ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a
> Found 1 items
> -rw-r--r--   3 ubuntu hadoop          0 2012-11-01 15:22 /tmp/a/t1.txt
> {noformat}

