hadoop-hdfs-issues mailing list archives

From "Lei Chang (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3107) HDFS truncate
Date Fri, 23 Mar 2012 05:54:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236370#comment-13236370 ]

Lei Chang commented on HDFS-3107:
---------------------------------

> Lei, do you see any issues with the proposal (i.e option 2) ?

I do like the second option :), for its simplicity in implementing atomicity. I also think it is better not to change the API, which will be much easier for end users to use. truncate(file, length, concatFile) can be used internally.

As for option 1, it indeed introduces a lot of implementation difficulties, such as fault-tolerance issues. We saw this in our first attempt to implement truncate, which used stronger semantics.

IMO, for visibility, weak consistency for concurrent reads is OK for upper-layer applications. For instance, most database systems already use their own locks to synchronize concurrent access to files or file blocks.
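For readers unfamiliar with the semantics being discussed: truncate is the reverse of append, cutting a file back to a given length. The following is only a local-filesystem analogy using `java.nio.channels.FileChannel.truncate` to illustrate the POSIX behavior the proposal wants HDFS to mirror; it is not HDFS code.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TruncateDemo {
    /** Write data to a temp file, truncate it to newLength, and return what remains. */
    static String writeThenTruncate(String data, long newLength) throws IOException {
        Path file = Files.createTempFile("truncate-demo", ".dat");
        Files.write(file, data.getBytes(StandardCharsets.US_ASCII));
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ch.truncate(newLength); // discard every byte past offset newLength
        }
        String remaining = new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
        Files.delete(file);
        return remaining;
    }

    public static void main(String[] args) throws IOException {
        // Append ten bytes, then truncate back to four: the reverse of the append.
        System.out.println(writeThenTruncate("0123456789", 4)); // prints "0123"
    }
}
```

The hard part in HDFS is not this single-writer case but making the length change atomic and consistent for concurrent readers, which is exactly what the options above trade off.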

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>            Reporter: Lei Chang
>         Attachments: HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the underlying storage
> when a transaction is aborted. Currently, HDFS does not support truncate (a standard POSIX
> operation), the reverse operation of append. This forces upper-layer applications to use
> ugly workarounds (such as keeping track of the discarded byte range per file in a separate
> metadata store, and periodically running a vacuum process to rewrite compacted files) to
> overcome this limitation of HDFS.
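The workaround described in the issue can be sketched as the following bookkeeping; the class and method names here are hypothetical, purely to illustrate what an application must maintain on the side when the filesystem has no native truncate.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: emulate truncate on an append-only store by recording
 * a logical length per file in separate metadata, as the issue describes.
 */
public class LogicalTruncateIndex {
    // Maps file path -> logical length; bytes past it are treated as discarded.
    private final Map<String, Long> logicalLength = new HashMap<>();

    /** Record that readers should ignore everything past newLength. */
    public void truncate(String path, long newLength) {
        // A truncate can only shrink the visible region, so keep the minimum.
        logicalLength.merge(path, newLength, Math::min);
    }

    /** Length readers should use; the physical length if never truncated. */
    public long effectiveLength(String path, long physicalLength) {
        return Math.min(physicalLength, logicalLength.getOrDefault(path, physicalLength));
    }
}
```

A periodic vacuum process would then rewrite each file down to its logical length and drop the metadata entry, which is exactly the overhead a native truncate would eliminate.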

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
