hadoop-user mailing list archives

From: Pablo Musa <pa...@psafe.com>
Subject: Re: HDFS Backup for Hadoop Update
Date: Tue, 26 Feb 2013 23:53:43 GMT
Following the idea of copying the data structure, I thought about rsync.

I could run rsync while the server is on and later apply only the diff,
which would be much faster, reducing the system's off-line time.
But I do not know whether Hadoop makes a lot of changes to the on-disk
data structure (blocks).
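
Something like the following is what I have in mind (a rough sketch;
/data/dfs and /backup/dfs are placeholders for the actual dfs.data.dir
and backup location on each node):

    # First pass while the DataNode is still running. The copy may be
    # inconsistent, but that is fine; it only pre-seeds the target.
    rsync -a /data/dfs/ /backup/dfs/

    # Stop the daemons, then run a second pass that transfers only the
    # delta and removes blocks deleted since the first pass.
    bin/stop-all.sh
    rsync -a --delete /data/dfs/ /backup/dfs/

The second pass should be quick if the block files do not change much
between the two runs, which is exactly my question above.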

Thanks again,
Pablo

On 02/26/2013 07:39 PM, Pablo Musa wrote:
> Hello guys,
> I am starting the upgrade from Hadoop 0.20 to a newer version (2.0),
> which changes the HDFS format. I read a lot of tutorials and they say
> that data loss is possible (as expected). In order to avoid HDFS data
> loss I will probably back up the whole HDFS structure (7 TB per node).
> However, this is a huge amount of data and it will take a lot of time,
> during which my service would be unavailable.
>
> I was thinking about a simple approach: copying all files to a
> different place. I tried to find a parallel file compressor to speed
> up the process, but could not find one.
>
> How did you guys do it?
> Is there some trick?
>
> Thank you in advance,
> Pablo Musa
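
P.S. On the parallel compressor I mention above: pigz (a parallel gzip)
may be what I was looking for. An untested sketch, again with
placeholder paths and an assumed 16 cores:

    # Archive the data directory and compress it using all cores.
    tar -cf - /data/dfs | pigz -p 16 > /backup/dfs-backup.tar.gz

    # Decompression in pigz is mostly single-threaded, so restoring
    # will not speed up as much as the backup does.
    pigz -dc /backup/dfs-backup.tar.gz | tar -xf -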

