ambari-user mailing list archives

From cs user <>
Subject Re: Recovering from a dead master namenode server
Date Fri, 04 Mar 2016 09:26:21 GMT
Hi Alejandro,

Many thanks for getting back to me. I'm currently trying to configure HA
for the NameNode and the YARN ResourceManager, which should help if we lose
a node at some point. I'm using a blueprint to bootstrap my cluster.

If I use the blueprint without HA enabled, start the cluster, and then
enable HA for both components by following the setup process, everything
works fine. At that point I exported the cluster blueprint and attempted
to re-create the cluster with HA configured from the start.

However, the install appears to fail. Should this be possible? I noticed
that when I enabled HA, I had to follow a number of manual steps. Is it
possible to have HA configured from the start with a blueprint?
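
One way to sanity-check an exported blueprint before reusing it is to count
the HA-related components across its host groups. The following is only a
minimal sketch: the helper name ha_shortfalls, the minimum counts, and the
sample blueprint are assumptions made for illustration, and it assumes each
host group maps to a single host. It does not validate the HA configuration
properties themselves.

```python
from collections import Counter

# Assumed minimums for a typical NameNode + ResourceManager HA layout.
# These numbers are illustrative, not taken from Ambari itself.
HA_MINIMUMS = {
    "NAMENODE": 2,
    "ZKFC": 2,
    "JOURNALNODE": 3,
    "ZOOKEEPER_SERVER": 3,
    "RESOURCEMANAGER": 2,
}


def ha_shortfalls(blueprint):
    """Return {component: (found, needed)} for components below the minimum.

    Counts one occurrence per host group that lists the component, so it
    assumes every host group is mapped to exactly one host.
    """
    found = Counter(
        component["name"]
        for group in blueprint.get("host_groups", [])
        for component in group.get("components", [])
    )
    return {
        name: (found[name], needed)
        for name, needed in HA_MINIMUMS.items()
        if found[name] < needed
    }


# Made-up blueprint fragment using the exported blueprint's general shape.
masters = ["NAMENODE", "ZKFC", "JOURNALNODE", "ZOOKEEPER_SERVER", "RESOURCEMANAGER"]
sample = {
    "host_groups": [
        {"name": "master_1", "components": [{"name": c} for c in masters]},
        {"name": "master_2", "components": [{"name": c} for c in masters]},
        {
            "name": "worker_1",
            "components": [
                {"name": "DATANODE"},
                {"name": "JOURNALNODE"},
                {"name": "ZOOKEEPER_SERVER"},
            ],
        },
    ]
}

print(ha_shortfalls(sample))
```

An empty result only means the component counts look plausible under these
assumptions; the install can still fail for other reasons.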


On Thu, Mar 3, 2016 at 7:00 PM, Alejandro Fernandez <> wrote:

> The situation is the same regardless of master/slave/client.
> Basically, you have to:
> 1. bring up a host with the same FQDN
> 2. install ambari-agent on it
> At this point, any components that used to be on that host will report
> heartbeat lost and the cluster may not be fully operational if it contained
> masters (especially NameNode).
> You may then have to restart services on that host, which will actually
> end up installing the bits again and generating configs.
> The hard part is that you may have to run additional commands depending on
> the type of master, think of NameNode or even hosts that contain databases
> for Hive, Oozie, etc.
> Attempting to move masters may be complicated because it may require the
> original host to be heartbeating and with the bits installed in order to be
> able to stop the services Ambari knows about.
> Thanks,
> Alejandro
> From: cs user <>
> Reply-To: "" <>
> Date: Thursday, March 3, 2016 at 5:00 AM
> To: "" <>
> Subject: Recovering from a dead master namenode server
> Hi All,
> I'm trying to understand how to recover from certain failures within
> Ambari. When launching within a cloud environment, it's possible that a
> host may be completely deleted, and you won't have the chance to
> decommission the node.
> For example, in the event that the server hosting the master HDFS NameNode
> was lost, would it be possible to spin up another server in its place,
> built completely from scratch, and have this replace the old NameNode master?
> Currently when I attempt to delete a failed host, it warns me that the
> following components need to be moved:
> NameNode, Spark History Server
> It also then tries to talk me through the process of copying data from the
> old namenode to the new namenode. If the server has been deleted, this
> would not be possible. Would it be possible to copy this data from the
> secondary namenode instead?
> Many thanks in advance.
> Cheers!
