ambari-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-20349) When SPNEGO authentication is enabled for Hadoop in a cluster with NN HA, PXF Process alert fails
Date Thu, 09 Mar 2017 16:07:38 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903285#comment-15903285 ]

Hudson commented on AMBARI-20349:
---------------------------------

SUCCESS: Integrated in Jenkins build Ambari-branch-2.5 #1221 (See [https://builds.apache.org/job/Ambari-branch-2.5/1221/])
AMBARI-20349. When SPNEGO authentication is enabled for Hadoop in a (rlevas: [http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=245fd5cf3689097fdcdde6a75c39de9d38e0bde8])
* (edit) ambari-server/src/main/resources/common-services/PXF/3.0.0/package/alerts/api_status.py


> When SPNEGO authentication is enabled for Hadoop in a cluster with NN HA, PXF Process alert fails
> -------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-20349
>                 URL: https://issues.apache.org/jira/browse/AMBARI-20349
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.2.2
>            Reporter: Robert Levas
>            Assignee: Robert Levas
>              Labels: PHD, PXF, kerberos
>             Fix For: 2.5.0
>
>         Attachments: AMBARI-20349_branch-2.5_01.patch, AMBARI-20349_trunk_01.patch
>
>
> When SPNEGO authentication is enabled for Hadoop in a cluster where NN HA is enabled, the PXF Process alert fails with the following errors in the ambari-agent.log file:
> {noformat}
> ERROR 2017-03-07 18:03:58,417 jmx.py:44 - Getting jmx metrics from NN failed. URL: http://c6401.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 41, in get_value_from_jmx
>     data_dict = json.loads(data)
>   File "/usr/lib/python2.6/site-packages/ambari_simplejson/__init__.py", line 307, in loads
>     return _default_decoder.decode(s)
>   File "/usr/lib/python2.6/site-packages/ambari_simplejson/decoder.py", line 335, in decode
>     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>   File "/usr/lib/python2.6/site-packages/ambari_simplejson/decoder.py", line 353, in raw_decode
>     raise ValueError("No JSON object could be decoded")
> ValueError: No JSON object could be decoded
> INFO 2017-03-07 18:04:02,769 logger.py:71 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://c6402.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmphTXg76 2>/tmp/tmp5bm2nM''] {'quiet': False}
> INFO 2017-03-07 18:04:02,797 logger.py:71 - call returned (0, '')
> ERROR 2017-03-07 18:04:02,798 jmx.py:44 - Getting jmx metrics from NN failed. URL: http://c6402.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 41, in get_value_from_jmx
>     data_dict = json.loads(data)
>   File "/usr/lib/python2.6/site-packages/ambari_simplejson/__init__.py", line 307, in loads
>     return _default_decoder.decode(s)
>   File "/usr/lib/python2.6/site-packages/ambari_simplejson/decoder.py", line 335, in decode
>     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>   File "/usr/lib/python2.6/site-packages/ambari_simplejson/decoder.py", line 353, in raw_decode
>     raise ValueError("No JSON object could be decoded")
> ValueError: No JSON object could be decoded
> {noformat}
> *Cause*
> During the test for the {{PXF Process}} alert, the active NN is found using a JMX call. This call requires SPNEGO authentication because SPNEGO authentication is turned on for the Hadoop web interfaces. However, a valid Kerberos ticket is not found in the configured user's Kerberos ticket cache. In this case, the configured user is the HDFS user, which technically is not necessary.
> This occurs in 
> {code:title=common-services/PXF/3.0.0/package/alerts/api_status.py:137}
>     if CLUSTER_ENV_SECURITY in configurations and configurations[CLUSTER_ENV_SECURITY].lower() == "true":
>       if 'dfs.nameservices' in configurations[HDFS_SITE]:
>         namenode_address = get_active_namenode(ConfigDictionary(configurations[HDFS_SITE]), configurations[CLUSTER_ENV_SECURITY], configurations[HADOOP_ENV_HDFS_USER])[1]
>       else:
>         namenode_address = configurations[HDFS_SITE]['dfs.namenode.http-address']
>       token = _get_delegation_token(namenode_address,
>                                      configurations[HADOOP_ENV_HDFS_USER],
>                                      configurations[HADOOP_ENV_HDFS_USER_KEYTAB],
>                                      configurations[HADOOP_ENV_HDFS_PRINCIPAL_NAME],
>                                      None)
>       commonPXFHeaders.update({"X-GP-TOKEN": token})
> {code}
> Specifically, the failure happens inside the call at
> {code}
> namenode_address = get_active_namenode(ConfigDictionary(configurations[HDFS_SITE]), configurations[CLUSTER_ENV_SECURITY], configurations[HADOOP_ENV_HDFS_USER])[1]
> {code}
> *Solution*
> Ensure the configured user's Kerberos ticket cache contains a valid ticket before querying for the active NN. Possibly change the acting user to the one executing the PXF component.
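
The proposed solution can be illustrated with a minimal sketch. This is not the committed patch: the function name {{ensure_kerberos_ticket}}, its parameters, and the injectable {{run}} argument are hypothetical, and the sketch assumes the MIT Kerberos {{klist}} and {{kinit}} tools are on the PATH. It checks the ticket cache first and only requests a new ticket from the keytab when the cache has no valid one.

```python
import subprocess


def ensure_kerberos_ticket(keytab, principal, run=subprocess.call):
    """Return True if the ticket cache holds a valid ticket afterwards.

    Hypothetical helper, not part of Ambari. `run` takes an argument list
    and returns the process exit code; it is injectable so the logic can
    be exercised without a KDC.
    """
    # `klist -s` prints nothing and exits 0 only when the credentials
    # cache is readable and not expired.
    if run(["klist", "-s"]) == 0:
        return True
    # No valid ticket: obtain one from the keytab for the given principal.
    return run(["kinit", "-kt", keytab, principal]) == 0
```

In the alert script, a check along these lines would run before {{get_active_namenode}} is called, using the keytab and principal already present in {{configurations}} (assumed here to be the values behind {{HADOOP_ENV_HDFS_USER_KEYTAB}} and {{HADOOP_ENV_HDFS_PRINCIPAL_NAME}}). The check would also need to run as the same user whose ticket cache the subsequent {{curl --negotiate}} call reads.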



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
