hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3107) HDFS truncate
Date Tue, 16 Sep 2014 22:43:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136404#comment-14136404 ]

Konstantin Shvachko commented on HDFS-3107:
-------------------------------------------

Hey [~cmccabe], most of your questions are answered in [my earlier comment|https://issues.apache.org/jira/browse/HDFS-3107?focusedCommentId=14123590&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14123590].
This is the design in a nutshell. The [snapshots issue is discussed here|https://issues.apache.org/jira/browse/HDFS-3107?focusedCommentId=14127371&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14127371].
I'll try to answer your questions in more detail.

??What are the use-cases.??
* The main use case, as far as I understand from this and other conversations, is transaction
handling for external databases. A DB writes its transactions into an HDFS file. While transactions
succeed, the DB keeps writing to the same file. But when a tx fails, it is aborted and the file
is truncated to the end of the previous successful transaction.
Yours are also good use cases.
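
To make the pattern concrete, here is a minimal sketch of the commit/abort cycle. It uses a local file and Python's os.truncate as a stand-in for an HDFS file, so it only illustrates the idea and is not the HDFS client API. The TxLog class and its method names are hypothetical.

```python
import os
import tempfile


class TxLog:
    """Toy transaction log on a local file (hypothetical stand-in for an HDFS file)."""

    def __init__(self, path):
        self.path = path
        open(path, "wb").close()      # create an empty log
        self.committed_length = 0     # length at the last successful transaction

    def write(self, payload: bytes) -> None:
        # Append transaction bytes to the end of the log.
        with open(self.path, "ab") as f:
            f.write(payload)

    def commit(self) -> None:
        # Everything written so far is now part of a successful transaction.
        self.committed_length = os.path.getsize(self.path)

    def abort(self) -> None:
        # Drop the bytes of the failed transaction by truncating the file
        # back to the end of the previous successful transaction.
        os.truncate(self.path, self.committed_length)


def demo():
    with tempfile.TemporaryDirectory() as d:
        log = TxLog(os.path.join(d, "tx.log"))
        log.write(b"tx1;")
        log.commit()
        log.write(b"tx2-partial")     # this transaction fails midway
        log.abort()
        with open(log.path, "rb") as f:
            return f.read()
```

After the abort only the committed bytes of the first transaction remain in the file.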

??If some client reads a file at position X after it has been truncated to X-1, what might
the client see???
* The client will see EOF. This should be the same as reading beyond the length of a file
that was never truncated.
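
As a rough illustration of that behavior, the sketch below uses ordinary local file semantics rather than an HDFS client, which is an assumption of this example. The read_at helper and the file name are hypothetical.

```python
import os
import tempfile


def read_at(path, position, length=1):
    # Read up to `length` bytes starting at `position`.
    # An empty result means the position is at or beyond EOF.
    with open(path, "rb") as f:
        f.seek(position)
        return f.read(length)


def demo_read_after_truncate():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        with open(path, "wb") as f:
            f.write(b"0123456789")
        x = 6
        os.truncate(path, x - 1)      # file now holds offsets 0..4 only
        beyond = read_at(path, x)     # position X is past the new end
        inside = read_at(path, x - 2) # last byte that survived the truncate
        return beyond, inside
```

The read at position X returns no data, exactly as it would for a file that had only ever been X-1 bytes long.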

??If data is appended after a truncate, it seems like we could get divergent histories in
many situations, where one client sees one timeline and another client sees another.??
* There is no divergent history. If you truncate, you lose the data that you truncated. You will
not be able to open the file for append until the truncate is actually completed and the DNs shrink
the last block replicas. Then the file can be opened for append and new data can be added.

??Are we going to guarantee that files never appear to shrink while clients have them open???
* Correct, truncate can be applied only to a closed file. If the file is open for write, an
attempt to truncate it fails.
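
The ordering rules from the last two answers can be sketched as a toy in-memory model. This is not the NameNode or DataNode code from the patch. ToyFile and all of its methods are hypothetical, and complete_truncate stands in for the DNs shrinking the last block replicas.

```python
class ToyFile:
    """Hypothetical in-memory model of the truncate/append ordering rules."""

    def __init__(self, data=b""):
        self.data = bytearray(data)
        self.open_for_write = False
        self.truncate_in_progress = False
        self.pending_length = None

    def open_append(self):
        # Append must wait until a pending truncate has completed.
        if self.truncate_in_progress:
            raise RuntimeError("truncate not yet completed")
        if self.open_for_write:
            raise RuntimeError("file is already open for write")
        self.open_for_write = True

    def close(self):
        self.open_for_write = False

    def truncate(self, new_length):
        # Truncate applies only to closed files.
        if self.open_for_write:
            raise RuntimeError("file is open for write")
        if new_length > len(self.data):
            raise ValueError("new length exceeds file length")
        self.truncate_in_progress = True
        self.pending_length = new_length

    def complete_truncate(self):
        # Stand-in for the replicas of the last block being shrunk.
        del self.data[self.pending_length:]
        self.truncate_in_progress = False


def demo_rules():
    f = ToyFile(b"abcdef")
    outcomes = []
    f.open_append()
    try:
        f.truncate(3)
    except RuntimeError:
        outcomes.append("truncate rejected while open")
    f.close()
    f.truncate(3)
    try:
        f.open_append()
    except RuntimeError:
        outcomes.append("append rejected during truncate")
    f.complete_truncate()
    f.open_append()
    f.data += b"XY"                   # new data lands after the truncated prefix
    f.close()
    outcomes.append(bytes(f.data))
    return outcomes
```

In this model the truncated bytes are gone before any new append can begin, so every reader sees a single history.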

??How does this interact with hflush and hsync???
* Truncate is not applicable to open files, so it does not interact with hflush and hsync,
which are applicable to open files only.

??How this interacts with snapshots.??
* This is something yet to be designed as [Nicholas mentioned|https://issues.apache.org/jira/browse/HDFS-3107?focusedCommentId=14129351&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14129351].
Targeted in HDFS-7056. Three options [have been proposed|https://issues.apache.org/jira/browse/HDFS-3107?focusedCommentId=14127371&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14127371].
I was looking at option two, where we reduce file length, but keep the data unchanged until
it is needed in snapshots. You are right, this does not work with interleaving truncates and
appends.
The current implementation prohibits truncate if the file has an active snapshot.

??how it's going to be implemented???
* There is a patch attached. Did you have a chance to review it? It is much simpler than append,
but it does not allow truncating files in snapshots. If we decide to implement a copy-on-write
approach for truncated files in snapshots, then we may end up creating a branch.

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS-3107.patch, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the underlying storage
when a transaction is aborted. Currently HDFS does not support truncate (a standard Posix
operation) which is a reverse operation of append, which makes upper layer applications use
ugly workarounds (such as keeping track of the discarded byte range per file in a separate
metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome
this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
