hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
Date Sat, 20 Oct 2012 00:32:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480561#comment-13480561 ]

Aaron T. Myers commented on HDFS-2802:
--------------------------------------

bq. I agree with you on this. We wanted to begin with this approach and then optimize it further
in memory. The initial patch uploaded here tried premature optimization both for memory and
snapshot creation time and thus made the code really complicated. But this is a definite goal
and that part of the design we will update as we continue to work. This is covered in open
issues/future work section.

My concern is that we're going about this the wrong way if the plan is to implement an
O(# of files + # of directories) solution and then optimize it via multi-threading, offloading
to disk, etc., as the design document suggests. We should work on coming up with a design which
is O(1), or something roughly O(log(# of files + # of directories)), both in terms of
time and space efficiency. This sort of fundamental design decision is not something that
can easily be improved incrementally. I hope you will be open to reworking the design
along these lines.
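
To illustrate the kind of complexity difference at stake, here is a minimal Python sketch. It is not the HDFS design or anything from the attached patch; the names Node, put, get and Namespace are hypothetical. It assumes an immutable namespace tree in which a write copies only the nodes on the path to the modified file, so taking a snapshot is just recording the current root reference, independent of the number of files and directories.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical immutable namespace entry: directory (children) or file (data)."""
    children: Dict[str, "Node"] = field(default_factory=dict)
    data: Optional[str] = None


def put(root: Node, path: Tuple[str, ...], data: str) -> Node:
    """Return a new root with `path` set to `data`.

    Only the nodes along `path` are copied; every other subtree is shared
    with the old root. Each step here shallow-copies one directory's entry
    map, so the cost is proportional to depth times directory size; a
    balanced persistent map per directory would be needed for a true
    logarithmic bound.
    """
    if not path:
        return Node(children=root.children, data=data)
    head, rest = path[0], path[1:]
    child = root.children.get(head, Node())
    new_children = dict(root.children)  # copy one directory's entries only
    new_children[head] = put(child, rest, data)
    return Node(children=new_children, data=root.data)


def get(root: Node, path: Tuple[str, ...]) -> Optional[str]:
    """Look up the data stored at `path`, or None if the path does not exist."""
    node = root
    for name in path:
        node = node.children.get(name)
        if node is None:
            return None
    return node.data


def split(path: str) -> Tuple[str, ...]:
    """Turn '/a/b/f' into ('a', 'b', 'f')."""
    return tuple(part for part in path.split("/") if part)


class Namespace:
    """Toy namespace with read-only snapshots (an assumption, not HDFS code)."""

    def __init__(self) -> None:
        self.root = Node()
        self.snapshots: Dict[str, Node] = {}

    def write(self, path: str, data: str) -> None:
        self.root = put(self.root, split(path), data)

    def read(self, path: str, snapshot: Optional[str] = None) -> Optional[str]:
        root = self.root if snapshot is None else self.snapshots[snapshot]
        return get(root, split(path))

    def create_snapshot(self, name: str) -> None:
        # O(1): remember the current root; nothing in the tree is walked or copied.
        self.snapshots[name] = self.root
```

In this sketch the cost of keeping a snapshot is paid later, and only for the paths that are actually modified after it was taken, rather than up front for every file and directory.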
                
> Support for RW/RO snapshots in HDFS
> -----------------------------------
>
>                 Key: HDFS-2802
>                 URL: https://issues.apache.org/jira/browse/HDFS-2802
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>            Reporter: Hari Mankude
>            Assignee: Hari Mankude
>         Attachments: snap.patch, snapshot-one-pager.pdf, Snapshots20121018.pdf
>
>
> Snapshots are point-in-time images of parts of the filesystem or the entire filesystem.
Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are
several use cases for snapshots in HDFS. I will post a detailed write-up soon with more
information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
