ambari-user mailing list archives

From Vivek Singh Raghuwanshi <vivekraghuwan...@gmail.com>
Subject Re: Hadoop Backup and Archival Cluster
Date Wed, 10 Feb 2016 20:46:22 GMT
Does Ambari have any cluster replication feature for copying data across
data centers for disaster recovery scenarios?
Cloudera, for example, can replicate data stored in HDFS, data stored in Hive
tables, Hive metastore data, and Impala metadata (catalog server metadata)
associated with Impala tables registered in the Hive metastore.

Sorry for comparing Ambari with Cloudera; I am just looking for a handy
feature in Ambari for our customers.

Regards

On Wed, Feb 10, 2016 at 1:37 PM, Benoit Perroud <benoit@noisette.ch> wrote:

> We're using Trumpet (http://verisign.github.io/trumpet/), an iNotify-like
> tool for HDFS, as the foundation of such inter-cluster replication.
> In a nutshell, every new file created in Cluster A notifies a
> replication system, which copies the file to Cluster B (see
> https://github.com/verisign/trumpet/blob/master/examples/src/main/java/com/verisign/vscc/hdfs/trumpet/client/example/TestApp.java
> for an example).
> For keeping Hive partitions in sync,
> https://github.com/daplab/hive-auto-partitioner should do the job (it also
> relies on Trumpet).
>
> Benoit
>
> On Wed, Feb 10, 2016 at 7:37 PM David Whitmore <
> David.Whitmore@catalinamarketing.com> wrote:
>
>> Vivek,
>>
>>
>>
>> You are correct, distcp will overwrite a file if it has changed or is new.
>>
>> As for running this in real time (i.e., as soon as data is deposited on
>> the source cluster), you will have to handle that yourself.
>>
>> Please be aware that if you are talking about Hive tables, you will also
>> need the Hive metastore.
>>
>> We copy our critical data from a Production Cluster to another Production
>> Cluster and to a Test Cluster on a daily basis.
>>
>> We also copy the contents of the Hive Metastore database.
>>
>> Be aware that if you restore the Hive Metastore database on the destination
>> cluster, any tables created solely on the destination cluster may disappear.
>>
>>
>>
>> David
>>
>>
>>
>>
>>
>> *From:* Vivek Singh Raghuwanshi [mailto:vivekraghuwanshi@gmail.com]
>> *Sent:* Wednesday, February 10, 2016 1:28 PM
>> *To:* user@ambari.apache.org
>> *Subject:* Re: Hadoop Backup and Archival Cluster
>>
>>
>>
>> Thanks David,
>>
>>
>>
>> I want to replicate the data as soon as it reaches the cluster, and delete
>> it from the source cluster after one year. I want Cluster B to work as a
>> hot backup and archive, with Cluster A holding only the latest data.
>>
>>
>>
>> As far as I know, distcp copies all the data and overwrites it. Please
>> correct me if I am wrong.
>>
>>
>>
>>
>>
>> On Wed, Feb 10, 2016 at 12:21 PM, David Whitmore <
>> David.Whitmore@catalinamarketing.com> wrote:
>>
>> Yes, you can run distcp to copy data from one cluster to another. distcp
>> also has an option that controls whether it deletes files on the
>> destination that are NOT on the source.
>>
>>
>>
>>
>>
>> *From:* Vivek Singh Raghuwanshi [mailto:vivekraghuwanshi@gmail.com]
>> *Sent:* Wednesday, February 10, 2016 1:16 PM
>> *To:* user@ambari.apache.org
>> *Subject:* Hadoop Backup and Archival Cluster
>>
>>
>>
>>
>> Hi Friends,
>>
>>
>>
>> I am planning to set up a Hadoop cluster (A) with a replica cluster (B),
>> so that once data reaches Cluster A it is replicated to Cluster B.
>> I have one question: if I delete data from Cluster A based on age, for
>> example data older than one month, is it also removed from Cluster B? If
>> yes, how can I avoid this?
>>
>> What I want to achieve:
>>
>> 1. Once data reaches Cluster A, it is automatically replicated to
>> Cluster B.
>>
>> 2. Data older than one year is removed automatically from Cluster A, but
>> not from Cluster B.
>>
>> 3. Anyone who wants to query the latest data can use Cluster A, while
>> older data remains available on Cluster B.
>>
>>
>>
>>
>>
>> Regards
>>
>> --
>>
>> ViVek Raghuwanshi
>> Mobile -+91-09595950504
>> Skype - vivek_raghuwanshi
>> IRC vivekraghuwanshi
>> http://vivekraghuwanshi.wordpress.com/
>> http://in.linkedin.com/in/vivekraghuwanshi
>>
>
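
The approach discussed in the quoted thread (an incremental distcp run that does not mirror deletions, plus a separate age-based cleanup on Cluster A) could be sketched roughly as follows. This is only an illustration under assumptions: the function names, the NameNode addresses, and the paths are hypothetical, and the script only builds the distcp command line and picks out expired paths; it does not list HDFS or delete anything. The `-update` and `-delete` flags are standard distcp options.

```python
from datetime import datetime, timedelta, timezone


def build_distcp_command(source_uri, target_uri, mirror_deletes=False):
    """Return the argv for an incremental distcp copy.

    -update copies only files that are missing or differ on the target.
    -delete (valid only together with -update or -overwrite) removes
    target files that no longer exist on the source, so it is left off
    by default to keep Cluster B as an archive.
    """
    command = ["hadoop", "distcp", "-update"]
    if mirror_deletes:
        command.append("-delete")
    command += [source_uri, target_uri]
    return command


def expired_paths(paths_with_mtime, now, retention_days=365):
    """Return the paths whose modification time is older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [path for path, mtime in paths_with_mtime if mtime < cutoff]


if __name__ == "__main__":
    # Hypothetical NameNode addresses and paths, for illustration only.
    print(" ".join(build_distcp_command(
        "hdfs://cluster-a-nn:8020/data", "hdfs://cluster-b-nn:8020/data")))
```

Because `-delete` is not passed, files removed from Cluster A by the retention job would stay on Cluster B after the next distcp run. Hive metastore contents still have to be copied separately, as noted in the thread.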


-- 
ViVek Raghuwanshi
Mobile -+91-09595950504
Skype - vivek_raghuwanshi
IRC vivekraghuwanshi
http://vivekraghuwanshi.wordpress.com/
http://in.linkedin.com/in/vivekraghuwanshi
