ambari-dev mailing list archives

From "Alejandro Fernandez" <afernan...@hortonworks.com>
Subject Re: Review Request 31002: RU - NodeManager failed to restart in Kerberized clusters
Date Fri, 13 Feb 2015 17:19:20 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31002/
-----------------------------------------------------------

(Updated Feb. 13, 2015, 5:19 p.m.)


Review request for Ambari, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Robert Levas.


Bugs: AMBARI-9627
    https://issues.apache.org/jira/browse/AMBARI-9627


Repository: ambari


Description
-------

NodeManager failed to restart in a Kerberized cluster during a Rolling Upgrade (RU).

I deployed a 3-node cluster with all services from HDFS through ZK, then enabled NameNode
HA, and kerberized the cluster.

When I performed an RU from the 2.2.0.0 GA bits to 2.2.1.0-2260, I first had to comment out an
error in the ZK server; then, when I reached the Slaves group, NodeManager failed. See the attached log.

```
Fail: Execution of '/usr/bin/kinit -kt /etc/security/keytabs/nm.service.keytab nm/_HOST@EXAMPLE.COM;'
returned 1. kinit: Keytab contains no suitable keys for nm/_HOST@EXAMPLE.COM while getting
initial credentials
```

```
[root@c6404 ~]# klist -kt /etc/security/keytabs/nm.service.keytab
Keytab name: FILE:/etc/security/keytabs/nm.service.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 02/12/15 23:23:29 nm/c6404.ambari.apache.org@EXAMPLE.COM
   1 02/12/15 23:23:29 nm/c6404.ambari.apache.org@EXAMPLE.COM
   1 02/12/15 23:23:29 nm/c6404.ambari.apache.org@EXAMPLE.COM
   1 02/12/15 23:23:29 nm/c6404.ambari.apache.org@EXAMPLE.COM
   1 02/12/15 23:23:29 nm/c6404.ambari.apache.org@EXAMPLE.COM
```

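The keytab holds keys for nm/c6404.ambari.apache.org@EXAMPLE.COM, but kinit was invoked with the
literal nm/_HOST@EXAMPLE.COM, so params.py is probably not replacing _HOST with the actual hostname.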
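
A minimal sketch of the kind of substitution that seems to be missing, in Python since params.py is a Python script. This is an illustration only, not the actual patch: the helper name `resolve_principal` and the choice to lowercase the hostname are assumptions, and the principal and hostname values are taken from the log and klist output above.

```python
import socket


def resolve_principal(principal, hostname=None):
    """Return the principal with the _HOST placeholder replaced by the hostname.

    Hypothetical helper for illustration; not part of Ambari.
    """
    if principal is None:
        return None
    if hostname is None:
        # Fall back to this machine's fully qualified domain name.
        hostname = socket.getfqdn()
    # Assumption: principals in the keytab use the lowercase hostname.
    return principal.replace("_HOST", hostname.lower())


# Values from the failure above: the configured principal and the failing host.
nm_principal = resolve_principal("nm/_HOST@EXAMPLE.COM", "c6404.ambari.apache.org")
print(nm_principal)
```

With the placeholder resolved this way, the principal passed to kinit would match the entries that klist reports for the keytab.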


Diffs
-----

  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/params.py
53beb96 

Diff: https://reviews.apache.org/r/31002/diff/


Testing (updated)
-------

Deployed a 3-node cluster with HDFS, YARN, ..., ZK, then added NameNode HA, and kerberized
the cluster. After performing an RU, I verified that the fix for NodeManager worked.

Unit tests passed in ABO.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:13 h
[INFO] Finished at: 2015-02-13T17:11:56+00:00
[INFO] Final Memory: 44M/475M
[INFO] ------------------------------------------------------------------------


Thanks,

Alejandro Fernandez

