hadoop-hdfs-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-231) Rsync like way of retrieving data from the dfs
Date Sun, 17 Jul 2011 19:19:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066716#comment-13066716 ]

Harsh J commented on HDFS-231:

I'd say that HDFS is the kind of system where you do not really require backups as long as you
have active node monitoring; and for a few files it's all right to do it non-incrementally.

For instance, you can also compare file modification dates to find what has changed.
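A minimal local-filesystem sketch of that date-based approach (the function name and structure are illustrative, not an HDFS API; against HDFS one would read modification times from `FileStatus` instead of `os.stat`):

```python
import os


def changed_since(root, last_backup_ts):
    """Return paths under `root` whose modification time is newer than
    `last_backup_ts` (seconds since the epoch). A local stand-in for
    selecting files to pull incrementally by date."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > last_backup_ts:
                changed.append(path)
    return sorted(changed)
```

Only files touched after the last backup timestamp are copied; everything else is skipped.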

> Rsync like way of retrieving data from the dfs
> ----------------------------------------------
>                 Key: HDFS-231
>                 URL: https://issues.apache.org/jira/browse/HDFS-231
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Johan Oskarsson
>            Assignee: Sameer Paranjpye
> As the dfs in our cluster contains a lot of important data, being able to retrieve it
> to a non-dfs backup node is essential.
> However, a lot of the files don't change in between backups, so a way to get only the
> files that have changed would be preferable.
> Since the blocks themselves already have a crc calculated, half the job is already done,
> if it's possible to split the destination files in similar blocks and calculate the crc for
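The block-CRC comparison described in the report can be sketched in miniature as follows (plain `zlib.crc32` over fixed-size blocks of in-memory data; this is an illustration of the idea, not how HDFS actually exposes its per-block checksums):

```python
import zlib

# HDFS-era default block size; tests below use a tiny size instead.
BLOCK_SIZE = 64 * 1024 * 1024


def block_crcs(data, block_size=BLOCK_SIZE):
    """CRC32 of each fixed-size block of `data`."""
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def changed_blocks(src, dst, block_size=BLOCK_SIZE):
    """Indices of blocks whose CRCs differ, or that exist on only one side.
    Only these blocks would need to be transferred to the backup node."""
    a = block_crcs(src, block_size)
    b = block_crcs(dst, block_size)
    n = max(len(a), len(b))
    return [i for i in range(n)
            if i >= len(a) or i >= len(b) or a[i] != b[i]]
```

Comparing CRCs block by block means an unchanged file costs only a checksum exchange, which is the rsync-like saving the issue asks for.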

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

