hadoop-mapreduce-user mailing list archives

From Vinod Kumar Vavilapalli <vino...@hortonworks.com>
Subject Re: YARN HA Active ResourceManager failover when machine is stopped
Date Thu, 23 Apr 2015 22:25:34 GMT
I have run into this offline with someone else too but couldn't root-cause it.

Will you be able to share your active/standby ResourceManager logs via pastebin or something?


On Apr 23, 2015, at 9:41 AM, Matt Narrell <matt.narrell@gmail.com> wrote:

I’m using Hadoop 2.6.0 from HDP 2.2.4, installed via Ambari 2.0.

I’m testing the YARN HA ResourceManager failover. If I STOP the active ResourceManager (shut
the machine off), the standby ResourceManager is elected to active, but the NodeManagers do
not register themselves with the newly elected active ResourceManager. If I restart the machine
(but DO NOT resume the YARN services) the NodeManagers register with the newly elected ResourceManager
and my jobs resume. I assume I have some bad configuration, as this produces a SPOF, and is
not HA in the sense I’m expecting.

