Date: Mon, 6 Oct 2014 17:07:35 +0000 (UTC)
From: "Milind Bhandarkar (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-3107) HDFS truncate

    [ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160511#comment-14160511 ]

Milind Bhandarkar commented on HDFS-3107:
-----------------------------------------

Dhruba,

Indeed. The lack of concurrent writes to a single HDFS file means that there will be only a single outstanding transaction against a file (unless the concurrency is implemented at a higher level). A database can consist of multiple files, though, and one can have multiple outstanding transactions against the database (one per file). In either case, rollback is achieved by truncating the file to the position it had before the transaction began.

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, editsStored
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard POSIX operation), the reverse of append. This forces upper-layer applications to use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
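
[Editor's illustration] The rollback-by-truncate pattern described in the comment above can be sketched roughly as follows, assuming a FileSystem#truncate(Path, long) method with the semantics proposed in HDFS-3107 and the attached design documents. The file path, the serializeTransaction() helper, and the retry behavior are hypothetical placeholders, not part of the actual patch.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class TruncateRollbackSketch {

        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path file = new Path("/db/table-000.dat");   // hypothetical data file

            // Record the file length before the transaction appends anything.
            long checkpoint = fs.getFileStatus(file).getLen();

            try (FSDataOutputStream out = fs.append(file)) {
                // Append the transaction's bytes; a commit is simply a
                // successful close of the appended data.
                out.write(serializeTransaction());
            } catch (IOException e) {
                // Roll back: cut the file back to its pre-transaction length.
                // Under the proposed semantics, truncate() may not complete
                // immediately (e.g. while recovery of the last block is in
                // progress), so a real client would wait and retry as needed.
                fs.truncate(file, checkpoint);
                throw e;
            }
        }

        // Placeholder for whatever bytes the transaction would append.
        private static byte[] serializeTransaction() {
            return new byte[0];
        }
    }

Because HDFS allows only a single writer per file, the checkpoint length recorded before fs.append() is guaranteed not to move underneath this client, which is what makes the simple length-based rollback shown here sufficient for the single-outstanding-transaction case discussed above.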