hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
Date Fri, 02 Nov 2012 07:28:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489296#comment-13489296 ]

Konstantin Shvachko commented on HDFS-2802:
-------------------------------------------

> .snapshot convention

Agreed, this seems to be a recognized convention. It also helps to resolve name collisions
between snapshots of a directory and snapshots of its subdirectories. Referring to my earlier example:
{code}
dr/.snapshot/1/sd2/file.txt
dr/sd2/.snapshot/1/file.txt
{code}
The first points to {{file.txt}} under snapshot #1 for {{dr/}}. And the second refers to {{file.txt}}
under snapshot #1 for {{sd2/}}.
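
To make the resolution rule concrete, here is a minimal sketch of how such a path could be split into its parts. It is only an illustration of the convention discussed above, not code from either design or from HDFS itself; the names {{split_snapshot_path}} and {{SnapshotPath}} are hypothetical.

```python
from typing import NamedTuple, Optional

SNAPSHOT_DIR = ".snapshot"  # the naming convention discussed in this thread


class SnapshotPath(NamedTuple):
    """Hypothetical result type, introduced only for this illustration."""
    snapshot_root: str   # directory the snapshot was taken on
    snapshot_name: str   # path component right after .snapshot
    relative_path: str   # path of the file inside that snapshot


def split_snapshot_path(path: str) -> Optional[SnapshotPath]:
    """Split a path at its first .snapshot component.

    Returns None when the path does not refer into a snapshot.
    """
    parts = [p for p in path.split("/") if p]
    if SNAPSHOT_DIR not in parts:
        return None
    i = parts.index(SNAPSHOT_DIR)
    if i + 1 >= len(parts):
        return None  # ".snapshot" with no snapshot name after it
    return SnapshotPath(
        snapshot_root="/".join(parts[:i]),
        snapshot_name=parts[i + 1],
        relative_path="/".join(parts[i + 2:]),
    )


first = split_snapshot_path("dr/.snapshot/1/sd2/file.txt")
second = split_snapshot_path("dr/sd2/.snapshot/1/file.txt")
print(first)
print(second)
```

The position of {{.snapshot}} alone decides which directory's snapshot is meant, so the two paths never collide even though both use snapshot name 1.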
The API makes sense to me. I like the use of regular rm and ls commands dealing with snapshots.

I would make the snapshot name optional in {{-createSnapshot}} (choosing names is such a hassle).
Should be easy since you already have internal unique ids. Looking at Aaron's last design
I see similar APIs, modulo the plural {{-allowSnapshots}} and moving all snapshot operations
under dfsadmin. The latter is not plausible, as users should be able to create and maintain
snapshots as long as the administrator has allowed it.

Suresh, do I understand correctly that files under .snapshot are read-only, so {{rm -r}} will
not remove them? You need to clarify how regular {{rm -r}} works in the design.
Same with {{ls -r}}. I see Aaron proposes to skip .snapshot, which makes sense.
Should we have something like {{ls -rs}} to list snapshottable directories in the tree?

Aaron, I understand that you propose to store consecutive ranges of snapshot versions in snapshotted
files, and I see you adapted this "to support subtree snapshots".
Not sure how you will handle snapshot deletion in the middle of the range. Say you have
snapshots 1 through 5, and then delete snapshot 3.
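
To illustrate why the question matters, here is a small sketch of one conceivable answer: splitting the stored range around the deleted id. This is purely an assumption made for illustration and is not taken from either design document; the function {{delete_snapshot}} and the representation of ranges as inclusive pairs are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first_id, last_id), an assumed representation


def delete_snapshot(ranges: List[Range], snapshot_id: int) -> List[Range]:
    """Remove one snapshot id from a list of inclusive id ranges.

    A range that contains the id is split into the part before it
    and the part after it; empty parts are dropped.
    """
    result: List[Range] = []
    for first, last in ranges:
        if not (first <= snapshot_id <= last):
            result.append((first, last))  # range is unaffected
            continue
        if first <= snapshot_id - 1:
            result.append((first, snapshot_id - 1))
        if snapshot_id + 1 <= last:
            result.append((snapshot_id + 1, last))
    return result


# Snapshots 1 through 5, then snapshot 3 is deleted.
print(delete_snapshot([(1, 5)], 3))  # [(1, 2), (4, 5)]
```

Under this assumption a single range per file is no longer enough after a mid-range deletion, so every affected file would need to carry a list of ranges.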
The idea of monotonically increasing snapshot ids seems to be the only major divergence of your
design from Suresh & Co. No doubt it should be discussed in detail, but does that justify
submitting an alternative design? You don't want to start Apache Hadoop design-document-wars
here, do you?

                
> Support for RW/RO snapshots in HDFS
> -----------------------------------
>
>                 Key: HDFS-2802
>                 URL: https://issues.apache.org/jira/browse/HDFS-2802
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>            Reporter: Hari Mankude
>            Assignee: Hari Mankude
>         Attachments: HDFS-2802.20121101.patch, HDFSSnapshotsDesign.pdf, snap.patch, snapshot-design.pdf,
snapshot-design.tex, snapshot-one-pager.pdf, Snapshots20121018.pdf, Snapshots20121030.pdf
>
>
> Snapshots are point in time images of parts of the filesystem or the entire filesystem.
Snapshots can be a read-only or a read-write point in time copy of the filesystem. There are
several use cases for snapshots in HDFS. I will post a detailed write-up soon with more
information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
