hadoop-common-user mailing list archives

From "Jim R. Wilson" <wilson.ji...@gmail.com>
Subject Re: Mirroring data to a non-Hadoop FS
Date Fri, 16 May 2008 18:43:35 GMT
There was some chatter on the HBase list about a dual HDFS/S3 driver
class that would write to both but read only from HDFS.  Of course,
having this functionality at the Hadoop level would be better than in
a subsidiary project.
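
To make the write-to-both, read-from-one idea concrete, here is a rough
sketch. It is not the driver discussed on the HBase list and it does not
use any Hadoop API: MirroredStore is a made-up name, and two local
directories stand in for HDFS (primary) and S3 (secondary).

```python
import os
import tempfile


class MirroredStore:
    """Hypothetical dual-write store: writes go to both roots, reads use the primary only."""

    def __init__(self, primary_root, secondary_root):
        self.primary_root = primary_root      # stand-in for HDFS
        self.secondary_root = secondary_root  # stand-in for S3 or another mirror

    def write(self, rel_path, data):
        # Write to the primary first, then to the mirror.
        # A failure on either side raises and is left to the caller.
        for root in (self.primary_root, self.secondary_root):
            target = os.path.join(root, rel_path)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as handle:
                handle.write(data)

    def read(self, rel_path):
        # Reads never touch the mirror.
        with open(os.path.join(self.primary_root, rel_path), "rb") as handle:
            return handle.read()


def demo():
    with tempfile.TemporaryDirectory() as primary, tempfile.TemporaryDirectory() as secondary:
        store = MirroredStore(primary, secondary)
        store.write("logs/part-00000", b"hello")
        mirrored = os.path.exists(os.path.join(secondary, "logs", "part-00000"))
        return store.read("logs/part-00000"), mirrored
```

A real driver would also have to decide what happens when the second
write fails after the first has succeeded; the sketch simply raises.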

Maybe Hadoop could let you specify a secondary filesystem in
hadoop-site.xml?  Candidates might include S3, NFS, or of course,
another HDFS in a geographically isolated location.

-- Jim R. Wilson (jimbojw)

On Fri, May 16, 2008 at 12:06 PM, Ted Dunning <tdunning@veoh.com> wrote:
> Why not go to the next step and use a second cluster as the backup?
> On 5/16/08 6:33 AM, "Robert Krüger" <krueger@signal7.de> wrote:
>> Hi,
>> what are the options for keeping a copy of data from an HDFS instance
>> in sync with a backup file system which is not HDFS? Are there
>> rsync-like tools that transfer only deltas, or would one have to
>> implement that oneself (e.g. by writing a Java program that accesses
>> both filesystems)?
>> Thanks in advance,
>> Robert
>> P.S.: Why would one want that? E.g. to have a completely redundant copy
>> which, in case of systematic failure (e.g. data corruption due to a bug),
>> offers a backup not affected by that problem.
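
For the "implement that oneself" route in Robert's question, a minimal
sketch of a delta-only copy might look like the following. It works at
whole-file granularity (size plus SHA-256 comparison), not with rsync's
block-level rolling checksums, and it never deletes anything from the
backup. Local directories stand in for both filesystems, and the function
names are made up for illustration.

```python
import hashlib
import os
import shutil
import tempfile


def file_digest(path):
    # Hash the file in 1 MiB chunks so large files are not loaded whole.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync_changed_files(source_root, backup_root):
    """Copy files that are missing from, or differ in, the backup.

    Returns the sorted relative paths that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            source_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(source_path, source_root)
            backup_path = os.path.join(backup_root, rel_path)
            # Cheap size check first; hash only when the sizes match.
            unchanged = (
                os.path.isfile(backup_path)
                and os.path.getsize(backup_path) == os.path.getsize(source_path)
                and file_digest(backup_path) == file_digest(source_path)
            )
            if unchanged:
                continue
            os.makedirs(os.path.dirname(backup_path), exist_ok=True)
            shutil.copyfile(source_path, backup_path)
            copied.append(rel_path)
    return sorted(copied)


def demo():
    with tempfile.TemporaryDirectory() as source, tempfile.TemporaryDirectory() as backup:
        with open(os.path.join(source, "a.txt"), "wb") as handle:
            handle.write(b"one")
        first = sync_changed_files(source, backup)   # new file is copied
        second = sync_changed_files(source, backup)  # nothing changed
        with open(os.path.join(source, "a.txt"), "wb") as handle:
            handle.write(b"two")
        third = sync_changed_files(source, backup)   # same size, new content
        return first, second, third
```

Hashing both sides means reading every same-sized file in full on each
run, so this saves transfer volume rather than read I/O.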
