ambari-dev mailing list archives

From "Aravindan Vijayan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-12376) False Ambari alerts after Ambari server reboot on secured cluster
Date Thu, 21 Jan 2016 19:39:39 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aravindan Vijayan updated AMBARI-12376:
---------------------------------------
    Component/s:     (was: ambari-metrics)

> False Ambari alerts after Ambari server reboot on secured cluster
> -----------------------------------------------------------------
>
>                 Key: AMBARI-12376
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12376
>             Project: Ambari
>          Issue Type: Bug
>    Affects Versions: 2.1.0
>            Reporter: Dave Disser
>
> HDP 2.3 cluster with Ambari 2.1 build #1319
> Cluster with HA Namenode, HA ResourceManager, HA Oozie, several other HA services installed via blueprint.
> After rebooting Ambari server host (which also has NN, ZK, JN instances), several Ambari alerts persist in this form:
> Percent NodeManagers Available:
> affected: [1], total: [3]
> NodeManager Health:
> Connection failed to http://roller4:8042/ws/v1/node/info (Execution of '/usr/bin/kinit -l 5m -c /var/lib/ambari-agent/data/tmp/nm_health_alert_cc_14246ce5caacfc93af574dc4b896debd -kt /etc/security/keytabs/spnego.service.keytab HTTP/roller4@VM6C1.HADOOP.COM > /dev/null' returned 1. kinit(v5): Cannot contact any KDC for realm 'VM6C1.HADOOP.COM' while getting initial credentials)
> NodeManager Web UI:
> Connection failed to http://roller5:8042 (Execution of '/usr/bin/kinit -l 5m -c /var/lib/ambari-agent/data/tmp/web_alert_cc_866ff322618d226db66f6f893a512256 -kt /etc/security/keytabs/spnego.service.keytab HTTP/roller5@VM6C1.HADOOP.COM > /dev/null' returned 1. kinit(v5): Cannot contact any KDC for realm 'VM6C1.HADOOP.COM' while getting initial credentials)
> (some FQDNs redacted)
> Failures are not consistent from test to test, but persist until ambari-server and ambari-agent are restarted on all nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)