ambari-dev mailing list archives

From "Andrew Robertson (JIRA)"
Subject [jira] [Created] (AMBARI-12995) Ambari alerts reports "UNKNOWN" error for secondary YARN RM and NM in a kerberoized YARN HA deployment
Date Thu, 03 Sep 2015 13:45:45 GMT
Andrew Robertson created AMBARI-12995:

             Summary: Ambari alerts reports "UNKNOWN" error for secondary YARN RM and NM in a kerberoized YARN HA deployment
                 Key: AMBARI-12995
             Project: Ambari
          Issue Type: Bug
          Components: alerts
    Affects Versions: 2.1.1
         Environment: Requires YARN HA with Kerberos
            Reporter: Andrew Robertson
             Fix For: 2.1.2

What is observed:

On my currently active YARN NodeManager and ResourceManager, Ambari
alerts are fine.

On the secondary YARN NodeManager and ResourceManager, Ambari reports
"Status: Unknown" / "HTTP 200 response (metrics unavailable)".  This
is for the alerts:
 - NodeManager Health Summary
 - ResourceManager CPU Utilization
 - ResourceManager RPC Latency

The Ambari web interface does not make this error obvious, because it
shows "0 alerts" in the top bar. The alerts with "unknown" status are
only visible on the Ambari alerts page or by querying the alerts API.

What is expected:
Ambari alerts should not raise any alarms for a secondary YARN HA node as long as the node
is responsive.

A network dump of the ambari poll against the secondary RM looks like:

GET /jmx?qry=Hadoop:service=ResourceManager,name=RMNMInfo HTTP/1.1

HTTP/1.1 200 OK
Refresh: 3; url=http://{my-primary-rm}:8088/jmx
Content-Length: 106
Server: Jetty(6.1.26.hwx)

This is standby RM. Redirecting to the current active RM:
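
The Refresh header in the dump above is the only machine-readable pointer to the active RM.
As a minimal sketch of how a poller could extract it, assuming the header always has the
"<seconds>; url=<target>" shape seen here (the function name parse_refresh_header is
hypothetical and is not part of Ambari or YARN):

```python
from typing import Optional, Tuple


def parse_refresh_header(value: str) -> Optional[Tuple[int, str]]:
    """Split a Refresh header value of the form '<seconds>; url=<target>'.

    Returns (delay_seconds, target_url), or None when the value does not
    have that shape. Hypothetical helper, for illustration only.
    """
    delay_part, sep, rest = value.partition(";")
    if not sep:
        return None
    # Split only on the first '=', so a query string in the target survives.
    key, eq, target = rest.strip().partition("=")
    target = target.strip()
    if not eq or key.strip().lower() != "url" or not target:
        return None
    try:
        delay = int(delay_part.strip())
    except ValueError:
        return None
    return delay, target
```

A caller could re-issue the JMX query against the returned URL, although that would then
measure the active RM rather than the standby one.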

I'm also filing a JIRA against YARN (per request from jhurley) and will post that info here.

Comment from Jonathan Hurley

This is caused by how YARN implements HA mode. With two YARN RMs, the standby RM returns a 200
response with a client-side redirect (the Refresh header in the dump above) instead of a 3xx
redirection. When not using Kerberos, Ambari should be able to parse the headers and follow
that redirect. However, on a Kerberized cluster, we use curl, which cannot do this. Therefore,
alerts that poll the secondary RM report UNKNOWN: the request did get a 200, but the response
contains no metrics. I think a few things can be improved here:

1) There should be a ticket filed for YARN to have their HA mode use a proper redirect.
2) Ambari might not want to produce an UNKNOWN response here, since it gives the false
impression that something went wrong.
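
Point 2 could be approached by recognising the standby response before treating it as a
failed metrics read. The following is only a sketch under stated assumptions, not Ambari's
alert code: the function names and the "STANDBY" label are hypothetical, the marker string
is taken from the response body in the dump, and a healthy JMX reply is assumed to be JSON
with a non-empty "beans" list.

```python
import json
from typing import Mapping

# Text seen at the start of the standby RM's response body in the dump.
STANDBY_MARKER = "This is standby RM"


def is_standby_response(status: int, headers: Mapping[str, str], body: str) -> bool:
    """True when a 200 reply looks like the standby RM's redirect page."""
    has_refresh = any(name.lower() == "refresh" for name in headers)
    return status == 200 and has_refresh and STANDBY_MARKER in body


def classify(status: int, headers: Mapping[str, str], body: str) -> str:
    """Return 'STANDBY' (hypothetical label), 'OK', or 'UNKNOWN'."""
    if is_standby_response(status, headers, body):
        return "STANDBY"
    if status != 200:
        return "UNKNOWN"
    try:
        payload = json.loads(body)
    except ValueError:
        return "UNKNOWN"
    # Assumption: metrics are available only if "beans" is a non-empty list.
    if isinstance(payload, dict) and payload.get("beans"):
        return "OK"
    return "UNKNOWN"
```

How a "STANDBY" result should surface (reported as OK, or skipped entirely) is a policy
decision this sketch leaves open.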
