hadoop-hdfs-issues mailing list archives

From "Milind Bhandarkar (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3107) HDFS truncate
Date Tue, 20 Mar 2012 23:05:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233922#comment-13233922 ]

Milind Bhandarkar commented on HDFS-3107:

I understand that truncate adds complexity, and I have discussed the design at length offline
with Sanjay and Hairong. We plan to reuse the append pipeline for this, and have therefore
restricted the API to work only with closed files. (We have submitted a presentation proposal
covering the exact use case to Hadoop Summit without exposing it to public voting for now,
and we hope to announce it publicly in a few weeks.) The transaction feature is not
HDFS-specific; it lives at the application level and works with other file systems that
support truncate.

bq. I don't follow... we don't even expose append() via the shell.

Indeed, I was not talking about Apache Hadoop, but a distribution that includes this feature.

bq. Otherwise I'm more inclined to agree with Eli's suggestion to remove append entirely (please
continue that discussion on-list, though).

Is the proposal to remove appends from all 1.x+ versions of Hadoop, or just the 1.x versions?

> HDFS truncate
> -------------
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>            Reporter: Lei Chang
>         Attachments: HDFS_truncate_semantics_Mar15.pdf
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
> Systems with transaction support often need to undo changes made to the underlying storage
when a transaction is aborted. HDFS currently does not support truncate (a standard POSIX
operation), which is the reverse of append. Upper-layer applications therefore resort to
ugly workarounds (such as keeping track of the discarded byte range per file in a separate
metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome
this limitation of HDFS.
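
As a rough illustration of the abort path described above, the following is a minimal sketch of an application-level transaction that undoes appends by truncating a closed file back to its last committed length. It assumes a local file system that supports truncate, not HDFS itself, and the names AppendTransaction and demo are hypothetical; nothing here reflects the API proposed in HDFS-3107.

```python
import os
import tempfile


class AppendTransaction:
    """Hypothetical application-level transaction over a single file.

    Every write opens and closes the file, so truncate only ever
    runs against a closed file.
    """

    def __init__(self, path):
        self.path = path
        # Remember the committed length so an abort knows where to cut.
        self.committed_len = os.path.getsize(path)

    def append(self, data):
        # Append-only write; the file is closed again on exit.
        with open(self.path, "ab") as f:
            f.write(data)

    def commit(self):
        # Everything appended so far becomes the new committed state.
        self.committed_len = os.path.getsize(self.path)

    def abort(self):
        # Truncate is the reverse of append: drop all bytes written
        # since the last commit.
        os.truncate(self.path, self.committed_len)


def demo():
    """Append inside a transaction, abort it, and return the file content."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"committed")
        txn = AppendTransaction(path)
        txn.append(b"-uncommitted")
        txn.abort()
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

With truncate available, the abort is a single length change on the file, and the application no longer needs a separate store of discarded byte ranges or a vacuum pass to reclaim them.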
