hadoop-mapreduce-user mailing list archives

From Steve Edison <sediso...@gmail.com>
Subject Re: How to Backup HDFS data ?
Date Thu, 24 Jan 2013 23:34:59 GMT
Backup to disk is what we do right now. Distcp would copy across HDFS
clusters, meaning I would have to build another 12-node cluster? Is that
correct?


On Thu, Jan 24, 2013 at 3:32 PM, Mathias Herberts <
mathias.herberts@gmail.com> wrote:

> Backup on tape or on disk?
>
> On disk, have another Hadoop cluster and do regular distcp.
>
> On tape, make sure you have a backup program which can back up streams,
> so you don't have to materialize your TB files outside of your Hadoop
> cluster first... (I know Simpana can't do that :-().
>
> On Fri, Jan 25, 2013 at 12:29 AM, Steve Edison <sedison70@gmail.com>
> wrote:
> > Folks,
> >
> > It's been a year and my HDFS / Solr / Hive setup is working flawlessly.
> > The data logs, which were meaningless to my business, all of a sudden
> > became precious to the extent that our management wants to back up this
> > data. I am talking about 20 TB of active HDFS data with an increment of
> > 2 TB/month. We would like to keep weekly and monthly backups for up to
> > 12 months.
> >
> > Any ideas how to do this ?
> >
> > -- Steve
>
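
The retention request in the original question (weekly and monthly backups kept for up to 12 months, on 20 TB growing by 2 TB/month) can be sized with a little arithmetic. The sketch below is only a rough estimate. The 20 TB and 2 TB/month figures come from the thread; everything else is an assumption made for illustration: each backup is a full copy, 12 monthly and 4 weekly copies are kept, a month is treated as 4 weeks, and no compression or deduplication is applied. The function names are hypothetical.

```python
# Rough backup sizing sketch.
# From the thread: 20 TB of active data today, growing by 2 TB per month.
# Assumptions (NOT from the thread): every backup is a full copy,
# 12 monthly and 4 weekly copies are retained, a month is 4 weeks,
# and there is no compression or deduplication.

CURRENT_TB = 20.0
GROWTH_TB_PER_MONTH = 2.0


def dataset_size_tb(months_from_now: float) -> float:
    """Estimated size of the live data set after the given number of months."""
    return CURRENT_TB + GROWTH_TB_PER_MONTH * months_from_now


def monthly_backups_tb(now_month: int, copies: int = 12) -> float:
    """Total size of the most recent `copies` monthly full backups."""
    return sum(dataset_size_tb(now_month - k) for k in range(copies))


def weekly_backups_tb(now_month: int, copies: int = 4) -> float:
    """Total size of the most recent `copies` weekly full backups."""
    return sum(dataset_size_tb(now_month - k / 4) for k in range(copies))


if __name__ == "__main__":
    month = 12  # one year from now
    print("live data (TB):      ", dataset_size_tb(month))
    print("monthly backups (TB):", monthly_backups_tb(month))
    print("weekly backups (TB): ", weekly_backups_tb(month))
```

Under these assumptions the retained full copies end up many times larger than the live data set, which is why the choice between full and incremental copies matters as much as the choice between a second cluster and tape.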
