hadoop-common-user mailing list archives

From Chetna C <chetna....@gmail.com>
Subject Re: How to backup and Restore Hadoop 2.x ?
Date Wed, 09 Sep 2015 08:22:34 GMT
We recently backed up one cluster to another using our in-house tool
Blueshift (https://github.com/flipkart-incubator/BlueShift). You could
try it.

Chetna Chaudhari

On 9 September 2015 at 12:12, James Bond <bond.bhai@gmail.com> wrote:

> One way is to create a backup cluster or a secondary cluster.
> 1. Ingest data into both clusters in parallel, that is, run the same jobs
> on both clusters. This gives you a backup and also lets you switch over to
> the backup cluster when the primary cluster has trouble. This setup usually
> makes sense when you have two data centers, one primary and the other
> backup.
> 2. Have a primary cluster and a secondary that is kept in sync with the
> primary, usually with distcp-type jobs. Cloudera provides a front end to
> manage this replication, but it essentially runs distcp in the background.
> 3. If you ingest data with Flume, Kafka, or similar, you can have it write
> to both the primary and secondary clusters.
> I am not sure whether anybody uses tape or an archive to back up a Hadoop
> cluster. I guess somebody who does can comment.
> On Wed, Sep 9, 2015 at 11:34 AM, Arthur Chan <arthur.hk.chan@gmail.com>
> wrote:
>> Hi,
>> Any idea how to back up and restore Hadoop 2.x? Use tape, form a new
>> Hadoop cluster, or some other option?
>> I use Hadoop 2.6 with HBase and Hive.
>> Thanks
>> Regards
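
For option 2 in the quoted reply (keeping a secondary cluster in sync with distcp-type jobs), here is a minimal Python sketch of a wrapper that a scheduler could call. It is an illustration under assumptions, not anyone's actual setup from this thread: the NameNode hosts `primary-nn` and `backup-nn`, port 8020, the `/data` path, and the helper names `build_distcp_command` and `run_distcp` are all hypothetical. It uses only the standard `-update`, `-delete`, and `-m` options of `hadoop distcp`, and defaults to a dry run that prints the command instead of executing it.

```python
import shlex
import subprocess


def build_distcp_command(source_uri, target_uri, delete_missing=False, max_maps=None):
    """Return the argv list for an incremental `hadoop distcp -update` copy.

    All URIs passed in are placeholders chosen by the caller.
    """
    # -update copies only files that are missing or differ on the target.
    cmd = ["hadoop", "distcp", "-update"]
    if delete_missing:
        # -delete removes target files that no longer exist on the source.
        cmd.append("-delete")
    if max_maps is not None:
        # -m caps the number of simultaneous map tasks used for the copy.
        cmd.extend(["-m", str(max_maps)])
    cmd.extend([source_uri, target_uri])
    return cmd


def run_distcp(cmd, dry_run=True):
    """Print the command (dry run) or execute it; return the exit code."""
    if dry_run:
        print(shlex.join(cmd))
        return 0
    # check=True raises CalledProcessError if distcp exits non-zero.
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    # Hypothetical cluster addresses and path, for illustration only.
    command = build_distcp_command(
        "hdfs://primary-nn:8020/data",
        "hdfs://backup-nn:8020/data",
        max_maps=20,
    )
    run_distcp(command)  # dry run: only prints the command line
```

Leaving `delete_missing` off by default is deliberate in this sketch: with `-delete`, an accidental removal on the primary would be propagated to the backup on the next sync.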
