From: Jeff Sposetti
To: "user@ambari.apache.org"
Subject: Re: Ambari data corruption/recovery process
Date: Sat, 27 Jun 2015 14:04:12 +0000

Hi,

(Others… please add/correct if I missed something).

I believe the keys are unrelated to whether the agent is bootstrapped with SSH or registered manually. There will be keys on the agents if the Ambari server-agent communication was set up for two-way SSL. This is not set by default in the Ambari Server ambari.properties. If it is enabled, you have this in the ambari.properties file:

security.server.two_way_ssl=true

So if two-way SSL is not enabled, the keys folder is empty on the agent hosts (and there is nothing to delete). If it is enabled, then yep, you have to clear that folder so that when the agent checks in with the replacement Ambari Server, the keys get re-created to work with the new Ambari Server.
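
To make the check above concrete, here is a minimal sketch (not anything shipped with Ambari) that reads the text of an ambari.properties file and reports whether the property is switched on. The function name `two_way_ssl_enabled` and the sample text are made up for illustration; the only thing taken from this thread is the property name `security.server.two_way_ssl` and the fact that it is off unless set.

```python
def two_way_ssl_enabled(properties_text: str) -> bool:
    """Return True if security.server.two_way_ssl is set to true.

    Hypothetical helper: parses simple key=value lines only.
    """
    enabled = False  # per the thread, two-way SSL is not set by default
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and .properties-style comments.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "security.server.two_way_ssl":
            # Keep scanning so that a later duplicate entry wins.
            enabled = value.strip().lower() == "true"
    return enabled


# Illustrative excerpt, not a real configuration file.
sample = """
# some.other.property is a placeholder
some.other.property=value
security.server.two_way_ssl=true
"""
print(two_way_ssl_enabled(sample))
```

If this returns False for the server's ambari.properties, the keys folder on the agents should be empty and there is nothing to clean up.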

Cheers,

Jeff

From: Alex Kaplan <akaplan@ifwe.co>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Saturday, June 27, 2015 at 3:16 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: Ambari data corruption/recovery process

Is removing that directory necessary for agents that registered without ssh?

On Jun 26, 2015 5:53 PM, "Yusaku Sako" <yusaku@hortonworks.com> wrote:
Yes, if you are talking about corruption, then you would need snapshots to go back to.
Recovery would be simpler if the Ambari Server hostname does not change (IP address changes should not matter).

One more step that I forgot to mention… you would need to delete /var/lib/ambari-agent/keys/* from each agent before restarting it.
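
For anyone scripting that cleanup step, a rough sketch follows. It empties a keys directory but leaves the directory itself in place. The helper name `clear_agent_keys` and the `dry_run` switch are illustrative assumptions, not part of Ambari. One difference from the shell glob above: `iterdir()` also picks up dot-files.

```python
import shutil
from pathlib import Path
from typing import List


def clear_agent_keys(keys_dir: Path, dry_run: bool = True) -> List[Path]:
    """List (and, unless dry_run, delete) everything inside keys_dir.

    Hypothetical helper; the directory itself is kept so the agent
    can re-create its keys there on the next check-in.
    """
    if not keys_dir.is_dir():
        return []
    entries = sorted(keys_dir.iterdir())
    if not dry_run:
        for entry in entries:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)  # real sub-directory: remove recursively
            else:
                entry.unlink()  # regular file or symlink
    return entries
```

On an agent host this would be called as `clear_agent_keys(Path("/var/lib/ambari-agent/keys"), dry_run=False)` before the agent is restarted. The default dry run only reports what would be removed.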

Yusaku

From: Clark Breyman <clark@breyman.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Friday, June 26, 2015 5:22 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: Ambari data corruption/recovery process

Thanks Yusaku for the quick response. 

For our production systems, we're planning on using Postgres replication to ensure backups, though that doesn't defend against data corruption. Perhaps snapshots will be required.
Is there any documentation on restoring to a newly provisioned host? Is there any reason to use a DNS A record instead of a CNAME alias to simplify the recovery process?


On Fri, Jun 26, 2015 at 5:14 PM, Yusaku Sako <yusaku@hortonworks.com> wrote:
Ambari DB should be backed up on a regular basis. This is the most important piece of information.
It is also advisable to back up /etc/ambari-server/conf/ambari.properties.
If you have these two, you can restore Ambari Server back to a running condition on a different host.
If the hostname of the Ambari Server changes, then you would have to update /etc/ambari-agent/conf/ambari-agent.ini to point to the new Ambari Server hostname and restart the agent.
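
As an illustration of the ambari-agent.ini update, here is a small sketch using Python's standard `configparser`. It assumes the file keeps the server address as `hostname` under a `[server]` section. That matches the stock agent config as far as I know, but check your own file first. The function name and the example hostnames are made up. Note that `configparser` drops comments and writes `key = value` with spaces, so treat this as a starting point, not a drop-in tool.

```python
import configparser
import io


def point_agent_at_server(ini_text: str, new_hostname: str) -> str:
    """Return ini_text with [server] hostname set to new_hostname.

    Hypothetical helper; assumes a [server] section with a
    'hostname' option, which is an assumption about the file layout.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section("server"):
        parser.add_section("server")
    parser.set("server", "hostname", new_hostname)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

The result would be written back to /etc/ambari-agent/conf/ambari-agent.ini on each agent host, followed by an agent restart as described above.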

Yusaku

From: Clark Breyman <clark@breyman.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Friday, June 26, 2015 5:10 PM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Ambari data corruption/recovery process

I'm wondering if anyone can share pointers/procedures/best practices to handle the scenarios where:

a) The sql database becomes corrupt. (Bugs, ...)
b) The Ambari service host is lost (e.g. EC2 instance termination, physical hardware loss, ...)

