hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3154) Add a notion of immutable/mutable files
Date Tue, 27 Mar 2012 15:02:26 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239539#comment-13239539 ]

Daryn Sharp commented on HDFS-3154:

bq. Users have to pass immutable/mutable as a flag in file creation. This is an unmodifiable
property of the created file.
I think the mutability of a file should be changeable at any time, not just during creation.
For example, Unix has a "chflags" command.

bq. You should be able to make a copy of an immutable file that takes up no extra space, but
can be appended to or truncated. The immutable file and the copy would share immutable blocks.
I really like this idea!  A while back I proposed COW (copy-on-write) copies in an offline
discussion, but there were questions about whether there was a valid use case.  Modifying an
immutable file would be such a use case.  It probably doesn't make sense to copy a very large
file (the client has to stream the data down and back up) just because the user wants to
append a little bit to an immutable file.

Given the lack of random writes, it should be relatively easy to handle an append to the final
block.  Either the final block could be re-replicated for the append, or the original file
could "remember" the length of its last block so that the block continues to be shared between
the original file and its copy.  In the second case, the original file would need to
re-replicate the block if it is later appended to and the block is longer than the length it
recorded, i.e. the block has already been appended to.  That gets tricky, so simply
re-replicating the final COW block when it is appended to would be the easiest.
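
To make the simpler option concrete, here is a minimal in-memory sketch of "re-replicate the
final COW block on append".  Everything in it is hypothetical (CowAppendSketch, Block,
FileEntry, cowCopy, and append are illustration names, not HDFS classes or APIs), and it
ignores block size limits, replication, and datanodes entirely.  It only shows the bookkeeping:
a copy shares every block with the original, and the first append to the copy swaps in a
private final block.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; none of these names exist in HDFS.
final class CowAppendSketch {

    /** A block with a length and a count of the files that list it. */
    static final class Block {
        final long id;
        long length;
        int owners;

        Block(long id, long length) {
            this.id = id;
            this.length = length;
        }
    }

    /** A file is an ordered list of blocks plus an immutability flag. */
    static final class FileEntry {
        final List<Block> blocks = new ArrayList<>();
        final boolean immutable;

        FileEntry(boolean immutable) {
            this.immutable = immutable;
        }

        long length() {
            long total = 0;
            for (Block b : blocks) {
                total += b.length;
            }
            return total;
        }
    }

    private long nextBlockId = 1;

    private Block newBlock(long length) {
        Block b = new Block(nextBlockId++, length);
        b.owners = 1;
        return b;
    }

    /** Creates a file whose blocks have the given lengths. */
    FileEntry create(boolean immutable, long... blockLengths) {
        FileEntry file = new FileEntry(immutable);
        for (long len : blockLengths) {
            file.blocks.add(newBlock(len));
        }
        return file;
    }

    /** Makes a mutable copy that shares every block with the source; no data is copied. */
    FileEntry cowCopy(FileEntry source) {
        FileEntry copy = new FileEntry(false);
        for (Block b : source.blocks) {
            b.owners++;
            copy.blocks.add(b);
        }
        return copy;
    }

    /** Appends to the final block, first replacing it with a private copy if it is shared. */
    void append(FileEntry file, long bytes) {
        if (file.immutable) {
            throw new IllegalStateException("file is immutable");
        }
        if (file.blocks.isEmpty()) {
            file.blocks.add(newBlock(bytes));
            return;
        }
        int lastIdx = file.blocks.size() - 1;
        Block last = file.blocks.get(lastIdx);
        if (last.owners > 1) {
            // Stand-in for "re-replicate the final COW block": only this one block is duplicated.
            Block privateCopy = newBlock(last.length);
            last.owners--;
            file.blocks.set(lastIdx, privateCopy);
            last = privateCopy;
        }
        last.length += bytes;
    }
}
```

With this approach the original file never has to track a remembered last-block length,
because the shared block is never written to while it has more than one owner.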

Currently the block manager requires a 1-to-1 block-to-inode association.  That would have
to change to 1-to-many, or a COW block would have to provide indirection to the real block.
I think the latter would be tricky unless real blocks contain a reference count.
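
As a rough illustration of the 1-to-many option, the sketch below keeps a set of owning inode
ids per block, so the size of that set doubles as the reference count.  The class and method
names (SharedBlockMapSketch, addReference, removeReference, referenceCount) are made up for
this example and do not correspond to the actual block manager code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical 1-to-many block-to-inode map; not the real HDFS block manager.
final class SharedBlockMapSketch {

    // blockId -> ids of the inodes that currently reference the block
    private final Map<Long, Set<Long>> ownersByBlock = new HashMap<>();

    /** Records that the given inode references the given block. */
    void addReference(long blockId, long inodeId) {
        ownersByBlock.computeIfAbsent(blockId, k -> new HashSet<>()).add(inodeId);
    }

    /**
     * Drops one inode's reference to a block.
     * Returns true when no inode references the block any longer,
     * i.e. when its replicas could be reclaimed.
     */
    boolean removeReference(long blockId, long inodeId) {
        Set<Long> owners = ownersByBlock.get(blockId);
        if (owners == null || !owners.remove(inodeId)) {
            throw new IllegalArgumentException(
                "inode " + inodeId + " does not reference block " + blockId);
        }
        if (owners.isEmpty()) {
            ownersByBlock.remove(blockId);
            return true;
        }
        return false;
    }

    /** Number of inodes that currently reference the block. */
    int referenceCount(long blockId) {
        Set<Long> owners = ownersByBlock.get(blockId);
        return owners == null ? 0 : owners.size();
    }
}
```

Deleting the original file or its copy would then call removeReference for each block, and
only a block whose last reference is dropped would be scheduled for deletion.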

> Add a notion of immutable/mutable files
> ---------------------------------------
>                 Key: HDFS-3154
>                 URL: https://issues.apache.org/jira/browse/HDFS-3154
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: name-node
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
> The notion of immutable file is useful since it lets the system and tools optimize certain
> things as discussed in [this email thread|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201203.mbox/%3CCAPn_vTuZomPmBTypP8_1xTr49Sj0fy7Mjhik4DbcAA+BLH53=g@mail.gmail.com%3E].
> Also, many applications require only immutable files.  Here is a proposal:
> - Immutable files means that the file content is immutable.  Operations such as append
>   and truncate that change the file content are not allowed to act on immutable files.  However,
>   the meta data such as replication and permission of an immutable file can be updated.  Immutable
>   files can also be deleted or renamed.
> - Users have to pass immutable/mutable as a flag in file creation.  This is an unmodifiable
>   property of the created file.
> - If users want to change the data in an immutable file, the file could be copied to
>   another file which is created as mutable.
