hadoop-user mailing list archives

From Serge Blazhiyevskyy <Serge.Blazhiyevs...@nice.com>
Subject Re: backup of hdfs data
Date Tue, 06 Nov 2012 07:40:30 GMT
I second this proposed solution. DistCp works very well for backing up data to a separate
cluster.

From: Bharath Mundlapudi <bharathwork@yahoo.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>,
Bharath Mundlapudi <bharathwork@yahoo.com>
Date: Tuesday, November 6, 2012 7:10 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: backup of hdfs data

If you have little data in your cluster (say, less than a few GB), then the answer is yes, but
it is an expensive route. For large data sets, traditional backup is neither feasible nor affordable.
If you want a cost-effective solution, you could set up another local or remote cluster and
try distcp, or simply copy the HDFS files to JBODs. Disk is cheap :).
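
For illustration only, here is a rough sketch of how a distcp run to a second cluster might be wrapped in a small Python script. The NameNode addresses and paths are made-up placeholders, not anything from this thread, and the helper names (build_distcp_command, run_backup) are hypothetical. The -update and -p options are standard distcp flags (copy only what is missing or changed at the destination, and preserve file status), but check the usage output of your Hadoop version before relying on them.

```python
import shlex
import subprocess


def build_distcp_command(src, dst, update=True, preserve=True):
    """Return the argv list for a `hadoop distcp` copy from src to dst."""
    cmd = ["hadoop", "distcp"]
    if update:
        # -update: copy only files that are missing or differ at the destination
        cmd.append("-update")
    if preserve:
        # -p: preserve file status (replication, block size, owner, permissions)
        cmd.append("-p")
    cmd += [src, dst]
    return cmd


def run_backup(src, dst, dry_run=True):
    """Print the command when dry_run is True, otherwise execute it."""
    cmd = build_distcp_command(src, dst)
    if dry_run:
        print(" ".join(shlex.quote(part) for part in cmd))
        return 0
    # Requires the `hadoop` launcher on PATH and access to both clusters.
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Placeholder URIs (assumptions); substitute your own NameNodes and paths.
    run_backup("hdfs://prod-nn:8020/data", "hdfs://backup-nn:8020/backups/data")
```

The dry run is the default so that the command can be inspected before any data moves; pass dry_run=False to actually launch the copy.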

-Bharath


________________________________
From: uday chopra <uday.chopra.1970@gmail.com>
To: user@hadoop.apache.org
Sent: Monday, November 5, 2012 4:19 PM
Subject: backup of hdfs data

What do folks do to back up HDFS data?
Has anyone had experience using enterprise solutions such as NetBackup with a Data Domain
D-2-D appliance to back up data in HDFS? If so, what is the average dedup ratio?
(I understand mileage can vary based on the type of data.)

Thanks,
Uday


