hadoop-hdfs-issues mailing list archives

From "Hari Mankude (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
Date Wed, 09 May 2012 17:31:50 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271616#comment-13271616 ]

Hari Mankude commented on HDFS-2802:

Regarding scenario #3, consider an HBase setup with a huge dataset in production. A new app has
been developed that needs to be validated against the production dataset. It is not feasible
to copy the entire dataset to a test setup. At the same time, the app is not ready for production,
and it is not safe to let it modify the data in the production database. One solution for this
type of problem is to take a RW snapshot of the production dataset and then run the development
app against the RW snapshot. After the app testing is done, the RW snapshot is deleted. This
assumes that the cluster has sufficient compute capacity and incremental storage capacity to
support RW snapshots.

Regarding appends, the current snapshot prototype relies on the file size that is available
at the namenode. So, if a file is appended to after a snap is taken, the append is a no-op from
the snap's perspective. If a snap is taken of a file that has an append pipeline set up, the
inode is of type underconstruction in the NN. The prototype relies on the file size that is
available on the NN for snaps. This might not be perfect, and I have some ideas on trying to acquire more up-to-date
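
To make the append behaviour concrete, here is a rough toy model in Python. It is not HDFS
code: the class ToyNameNode, its methods, and the example path are all made up for
illustration, and it assumes the NN learns the new length as soon as an append happens,
which need not hold for a file that is under construction.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for the NN: tracks only the file length it knows."""
    lengths: dict = field(default_factory=dict)    # path -> length known to the NN
    snapshots: dict = field(default_factory=dict)  # snap name -> {path: length at snap time}

    def append(self, path: str, nbytes: int) -> None:
        # Assumption: the NN sees the new length immediately. In a real cluster
        # the NN's view of a file under construction can lag behind the writer.
        self.lengths[path] = self.lengths.get(path, 0) + nbytes

    def take_snapshot(self, name: str) -> None:
        # Record the NN-side length of every file at this instant.
        self.snapshots[name] = dict(self.lengths)

    def snapshot_length(self, name: str, path: str) -> int:
        # The snapshot only ever reports the length recorded when it was taken.
        return self.snapshots[name][path]


nn = ToyNameNode()
nn.append("/hbase/table1/hfile", 100)
nn.take_snapshot("s1")
nn.append("/hbase/table1/hfile", 50)  # append after the snap was taken

live_len = nn.lengths["/hbase/table1/hfile"]
snap_len = nn.snapshot_length("s1", "/hbase/table1/hfile")
print(live_len, snap_len)
```

In this model the later append changes only the live length; the length seen through the
snap stays at whatever the NN had recorded when the snap was taken.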

I thought that truncate is not currently supported in trunk. If you are referring to deletes,
the prototype handles deletes correctly without issues.

I will post a more detailed doc after I am done with the HA-related work.
> Support for RW/RO snapshots in HDFS
> -----------------------------------
>                 Key: HDFS-2802
>                 URL: https://issues.apache.org/jira/browse/HDFS-2802
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>    Affects Versions: 0.24.0
>            Reporter: Hari Mankude
>            Assignee: Hari Mankude
>         Attachments: snapshot-one-pager.pdf
> Snapshots are point-in-time images of parts of the filesystem or the entire filesystem.
Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are
several use cases for snapshots in HDFS. I will post a detailed write-up soon with more


