Date: Tue, 6 Nov 2012 22:20:12 +0000 (UTC)
From: "Colin Patrick McCabe (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <1549023768.77287.1352240412157.JavaMail.jiratomcat@arcas>
In-Reply-To: <1622035329.58100.1351808653017.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HDFS-4140) fuse-dfs silently truncates files being overwritten

    [ https://issues.apache.org/jira/browse/HDFS-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491883#comment-13491883 ]

Colin Patrick McCabe commented on HDFS-4140:
--------------------------------------------

The description is a little misleading here.
Basically, the problem is that this operation:

{code}
open("/mnt/fuse-dfs/t", O_CREAT | O_TRUNC | O_WRONLY, 0644);
{code}

gets translated into this sequence of fuse-dfs calls:

{code}
TRACE open /t
TRACE truncate /t
TRACE unlink /t
TRACE getattr /t
TRACE flush /t
TRACE release /t
{code}

(I'm assuming that another open would have followed if our unlink hadn't returned an error.)

There are a few different quality-of-implementation issues here:
* HDFS doesn't react well to a file being unlinked while it's open, which is what we're doing here
* truncate tries to do its own create + close cycle in the middle, which means that we're trying to open a file for write while it's already open. Not good.

{{FUSE_CAP_ATOMIC_O_TRUNC}} could help stop fuse from translating a single open into so many complicated fuse operations. However, it's not supported on all kernel versions (I think definitely not on CentOS 5, for example). There are a bunch of hacks we could do to "fix" this on older kernels, but they won't be easy.

> fuse-dfs silently truncates files being overwritten
> ---------------------------------------------------
>
>                 Key: HDFS-4140
>                 URL: https://issues.apache.org/jira/browse/HDFS-4140
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fuse-dfs
>    Affects Versions: 2.0.2-alpha
>            Reporter: Andy Isaacson
>            Assignee: Colin Patrick McCabe
>
> When fuse-dfs is mounted in RW mode, overwriting a file that has content results in the file being truncated to 0 bytes (losing both the old and the new content).
> {noformat}
> ubuntu@ubu-cdh-0:~$ echo foo > /export/hdfs/tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a
> total 0
> -rw-r--r-- 1 ubuntu hadoop 4 Nov 1 15:21 t1.txt
> ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a
> Found 1 items
> -rw-r--r-- 3 ubuntu hadoop 4 2012-11-01 15:21 /tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ echo bar > /export/hdfs/tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a
> total 0
> -rw-r--r-- 1 ubuntu hadoop 0 Nov 1 15:22 t1.txt
> ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a
> Found 1 items
> -rw-r--r-- 3 ubuntu hadoop 0 2012-11-01 15:22 /tmp/a/t1.txt
> {noformat}
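
For anyone who wants to check whether a given mount is affected, here is a minimal sketch of a reproducer in C. It overwrites a file using the same open() flags quoted in the comment (O_CREAT | O_TRUNC | O_WRONLY) and returns the resulting size. The function name overwrite_and_measure and whatever path is passed to it are illustrative assumptions, not part of fuse-dfs.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of fuse-dfs): overwrite `path` with `data`
 * using the open() flags from the comment, then return the file size in
 * bytes as reported by stat(), or -1 if any step fails.
 */
long overwrite_and_measure(const char *path, const char *data)
{
    size_t len = strlen(data);
    struct stat st;

    /* Same flags as the open() call discussed in the comment. */
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    ssize_t written = write(fd, data, len);
    if (written < 0) {
        perror("write");
        close(fd);
        return -1;
    }
    if ((size_t)written != len) {
        /* Treat a short write as a failure to keep the sketch simple. */
        fprintf(stderr, "short write: %zd of %zu bytes\n", written, len);
        close(fd);
        return -1;
    }

    if (close(fd) != 0) {
        perror("close");
        return -1;
    }

    if (stat(path, &st) != 0) {
        perror("stat");
        return -1;
    }
    return (long)st.st_size;
}
```

On a local filesystem, calling it twice on the same path with "foo\n" and then "bar\n" should return 4 both times. On an affected fuse-dfs mount, going by the report, the second call would be expected either to fail or to leave a 0-byte file.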