ambari-dev mailing list archives

From "jaehoon ko (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMBARI-7753) DataNode decommission error in secured cluster
Date Mon, 13 Oct 2014 01:56:33 GMT
jaehoon ko created AMBARI-7753:
----------------------------------

             Summary: DataNode decommission error in secured cluster
                 Key: AMBARI-7753
                 URL: https://issues.apache.org/jira/browse/AMBARI-7753
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server, stacks
         Environment: Ambari-1.6.1 with HDP-2.1.5
            Reporter: jaehoon ko


Decommissioning a DataNode from a secured cluster fails with the following error messages:

{code}
STDERR: 
2014-10-13 10:37:31,896 - Error while executing command 'decommission':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 111, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/namenode.py", line 66, in decommission
    namenode(action="decommission")
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 70, in namenode
    decommission()
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 145, in decommission
    user=hdfs_user
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 239, in action_run
    raise ex
Fail: Execution of '/usr/bin/kinit -kt /etc/security/keytabs/dn.service.keytab dn/master-6.amber.gbcl.net@AMBER.GBCLUSTER.NET;' returned 1. kinit: Client not found in Kerberos database while getting initial credentials
{code}

{code}
STDOUT:
2014-10-13 10:37:31,793 - File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2014-10-13 10:37:31,796 - Writing File['/etc/hadoop/conf/dfs.exclude'] because contents don't match
2014-10-13 10:37:31,797 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/dn.service.keytab dn/master-6.amber.gbcl.net@AMBER.GBCLUSTER.NET;'] {'user': 'hdfs'}
2014-10-13 10:37:31,896 - Error while executing command 'decommission':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 111, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/namenode.py", line 66, in decommission
    namenode(action="decommission")
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 70, in namenode
    decommission()
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 145, in decommission
    user=hdfs_user
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 239, in action_run
    raise ex
Fail: Execution of '/usr/bin/kinit -kt /etc/security/keytabs/dn.service.keytab dn/master-6.amber.gbcl.net@AMBER.GBCLUSTER.NET;' returned 1. kinit: Client not found in Kerberos database while getting initial credentials
{code}

The reason is that Ambari-agent uses the DataNode principal to perform the HDFS refresh, which
should be done as the NameNode. This error can be solved by letting Ambari-agent use the NameNode
Kerberos principal and keytab. Note that [AMBARI-5729|https://issues.apache.org/jira/browse/AMBARI-5729]
solves a similar issue for NodeManager.
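
To illustrate the proposed change, here is a minimal sketch of how the decommission step could assemble its commands with the NameNode credentials instead of the DataNode ones. This is not the actual patch or the contents of hdfs_namenode.py: the function names, the nn.service.keytab path, and the nn/... principal shown in the usage note are assumptions made for illustration. Only the kinit command format (taken from the log above) and the `hdfs dfsadmin -refreshNodes` command are known to exist.

```python
def build_kinit_command(kinit_path, keytab, principal):
    """Return a kinit command in the same format seen in the agent log."""
    return "%s -kt %s %s;" % (kinit_path, keytab, principal)


def build_refresh_commands(security_enabled, kinit_path, nn_keytab, nn_principal):
    """Return the shell commands for the HDFS node refresh (hypothetical helper).

    On a secured cluster, authenticate as the NameNode first, because the
    refresh is a NameNode operation and the DataNode principal may not exist
    for the NameNode host.
    """
    commands = []
    if security_enabled:
        # Assumption: the NameNode keytab and principal are passed in by the caller.
        commands.append(build_kinit_command(kinit_path, nn_keytab, nn_principal))
    # Ask the NameNode to re-read its include/exclude host files.
    commands.append("hdfs dfsadmin -refreshNodes")
    return commands
```

For example, calling `build_refresh_commands(True, "/usr/bin/kinit", "/etc/security/keytabs/nn.service.keytab", "nn/master-6.amber.gbcl.net@AMBER.GBCLUSTER.NET")` yields a kinit command using the NameNode keytab followed by the refresh command, while passing `False` for `security_enabled` yields only the refresh command.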



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
