Date: Wed, 17 Sep 2014 20:25:34 +0000 (UTC)
From: "Colin Patrick McCabe (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-3107) HDFS truncate

    [ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137911#comment-14137911 ]

Colin Patrick McCabe commented on HDFS-3107:
--------------------------------------------

bq. Using a record-length prefix is not a good fix to get around this. What happens if you fail when writing your record length?

In that case, the record is incomplete and not valid. It's pretty clear when bytes are missing from a fixed-length 4-byte record.

bq. I would argue that this has everything to do with append. You are absolutely correct that HDFS can write a bad file on a standard open/write. The 'undo' for this failure is the delete operation. Your data integrity is preserved regardless of any external factors (file format, metadata, applications, etc). You can't have bad data if you never write bad data.

I don't follow. What does append have to do with writing partial records? You can write partial records without append, and append doesn't make it any more or less likely. As I said earlier, "append" really should have been called "reopen for write". You don't need to use "append" to create and append to a file (confusing, I know).

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS-3107.patch, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard POSIX operation), the reverse operation of append. This forces upper-layer applications to use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
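
For illustration, here is a minimal sketch of the fixed-length, 4-byte record-length-prefix check discussed in the comment above. The class and method names are assumptions made up for this example; they are not HDFS or Hadoop APIs.

{code:java}
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

/**
 * Illustrative reader for a stream of length-prefixed records:
 * a fixed 4-byte big-endian length followed by that many payload bytes.
 * A stream that ends mid-prefix or mid-payload is reported as a
 * partial (invalid) trailing record rather than silently accepted.
 */
public class LengthPrefixedRecordReader {

  /** Returns the next payload, or null at a clean end of stream. */
  public static byte[] readRecord(DataInputStream in) throws IOException {
    int first = in.read();
    if (first < 0) {
      return null;                          // clean EOF: no bytes of a new record
    }
    byte[] prefix = new byte[3];
    try {
      in.readFully(prefix);                 // remaining 3 bytes of the length prefix
    } catch (EOFException e) {
      throw new IOException("Partial record: truncated 4-byte length prefix", e);
    }
    int length = (first << 24)
        | ((prefix[0] & 0xff) << 16)
        | ((prefix[1] & 0xff) << 8)
        | (prefix[2] & 0xff);
    byte[] payload = new byte[length];
    try {
      in.readFully(payload);                // payload cut short -> partial record
    } catch (EOFException e) {
      throw new IOException("Partial record: expected " + length
          + " payload bytes", e);
    }
    return payload;
  }
}
{code}

Whether the writer dies while writing the prefix or while writing the payload, the reader sees a short read and can discard the trailing partial record, which is the point being made above: a missing tail of a fixed-length prefix is just as detectable as a missing tail of the payload.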