From "Sam Mingolelli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-15165) HDFS Datanode won't start in secure cluster
Date Wed, 24 Feb 2016 23:35:18 GMT

[ https://issues.apache.org/jira/browse/AMBARI-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166335#comment-15166335 ]

Sam Mingolelli commented on AMBARI-15165:
-----------------------------------------

I think I figured out this particular issue. This line is key:

{quote}
java.io.IOException: Login failure for dn/host-192-168-114-49.td.local@<REDACTED KERBEROS
REALM> from keytab /etc/security/keytabs/dn.service.keytab: javax.security.auth.login.LoginException:
Unable to obtain password from user
{quote}
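
To confirm where the mismatch comes from, a check along these lines can be run on the affected host (the keytab path is the one from the error above; klist, hostnamectl and hostname are standard tools, though the exact output format will vary):

{code}
# List the principals actually stored in the DataNode keytab.
$ klist -kt /etc/security/keytabs/dn.service.keytab

# Compare them with the names the machine claims for itself.
$ hostnamectl --static
$ hostname -A
{code}

If the dn/<hostname>@REALM principal that the DataNode builds from the resolved hostname is not present in the keytab, the login module falls back to prompting for a password (visible as Krb5LoginModule.promptForPass in the stack trace below), which is what produces the "Unable to obtain password from user" failure above.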

For whatever reason this system was identifying itself by two different hostnames. I had used
hostnamectl to set the hostname explicitly, but when Ambari + HDP + Kerberos constructed the
principal it used the hostname host-192-168-114-49.td.local instead. I resolved the issue by
explicitly setting the hostname in the host's /etc/hosts file as well.

Running hostname -A showed that the offending hostname was still in use; adding the entry to
/etc/hosts placated both hostname -A and Ambari, so they now see the identical hostname that
hostnamectl reports.
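
As a rough sketch of that workaround (the IP address and the first FQDN are placeholders; substitute the name set via hostnamectl and the host's real address):

{code}
# Pin the host's address to one canonical FQDN so that hostname -A, hostnamectl
# and the Ambari-generated Kerberos principal all agree. Placeholder values below.
$ hostnamectl set-hostname desired-name.td.local
$ echo '192.168.114.49  desired-name.td.local  host-192-168-114-49.td.local' | sudo tee -a /etc/hosts

# Verify the names agree before restarting the DataNode.
$ hostnamectl --static
$ hostname -A
{code}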

> HDFS Datanode won't start in secure cluster
> -------------------------------------------
>
>                 Key: AMBARI-15165
>                 URL: https://issues.apache.org/jira/browse/AMBARI-15165
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-web
>    Affects Versions: 2.2.0
>         Environment: {code}
> $ cat /etc/redhat-release
> CentOS Linux release 7.2.1511 (Core)
> $ uname -a
> Linux dev09-ost-hivetest-h-hb02.td.local 3.10.0-327.10.1.el7.x86_64 #1 SMP Tue Feb 16
17:03:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> {code}
>            Reporter: Sam Mingolelli
>
> This issue sounds related, but I'm on a newer version that should already include this patch: https://issues.apache.org/jira/browse/AMBARI-12355
> When I attempt to Kerberize an HDP cluster, the startup of the HDFS datanode fails quietly. There is nothing telling in the logs; see the ambari-agent errors log referenced below.
> {code}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py",
line 167, in <module>
>     DataNode().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 219, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py",
line 62, in start
>     datanode(action="start")
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89,
in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_datanode.py",
line 72, in datanode
>     create_log_dir=True
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py",
line 267, in service
>     Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154,
in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
158, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
121, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
line 238, in action_run
>     tries=self.resource.tries, try_sleep=self.resource.try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70,
in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92,
in checked_call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140,
in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291,
in _call
>     raise Fail(err_msg)
> resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh  -H -E /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh
--config /usr/hdp/current/hadoop-client/conf start datanode' returned 1. starting datanode,
logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-dev09-ost-hivetest-h-hb02.td.local.out
> stdout:   /var/lib/ambari-agent/data/output-228.txt
> 2016-02-24 10:51:14,841 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists,
will call conf-select on it for version 2.3.4.0-3485
> 2016-02-24 10:51:14,841 - Checking if need to create versioned conf dir /etc/hadoop/2.3.4.0-3485/0
> 2016-02-24 10:51:14,841 - call['conf-select create-conf-dir --package hadoop --stack-version
2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr':
-1}
> 2016-02-24 10:51:14,877 - call returned (1, '/etc/hadoop/2.3.4.0-3485/0 exist already',
'')
> 2016-02-24 10:51:14,878 - checked_call['conf-select set-conf-dir --package hadoop --stack-version
2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2016-02-24 10:51:14,910 - checked_call returned (0, '/usr/hdp/2.3.4.0-3485/hadoop/conf
-> /etc/hadoop/2.3.4.0-3485/0')
> 2016-02-24 10:51:14,910 - Ensuring that hadoop has the correct symlink structure
> 2016-02-24 10:51:14,910 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2016-02-24 10:51:15,091 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists,
will call conf-select on it for version 2.3.4.0-3485
> 2016-02-24 10:51:15,091 - Checking if need to create versioned conf dir /etc/hadoop/2.3.4.0-3485/0
> 2016-02-24 10:51:15,091 - call['conf-select create-conf-dir --package hadoop --stack-version
2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr':
-1}
> 2016-02-24 10:51:15,120 - call returned (1, '/etc/hadoop/2.3.4.0-3485/0 exist already',
'')
> 2016-02-24 10:51:15,121 - checked_call['conf-select set-conf-dir --package hadoop --stack-version
2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2016-02-24 10:51:15,162 - checked_call returned (0, '/usr/hdp/2.3.4.0-3485/hadoop/conf
-> /etc/hadoop/2.3.4.0-3485/0')
> 2016-02-24 10:51:15,162 - Ensuring that hadoop has the correct symlink structure
> 2016-02-24 10:51:15,162 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2016-02-24 10:51:15,164 - Group['hadoop'] {}
> 2016-02-24 10:51:15,165 - Group['users'] {}
> 2016-02-24 10:51:15,166 - Group['knox'] {}
> 2016-02-24 10:51:15,166 - User['hive'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,167 - User['zookeeper'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,168 - User['ams'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,168 - User['ambari-qa'] {'gid': 'hadoop', 'groups': [u'users']}
> 2016-02-24 10:51:15,169 - User['tez'] {'gid': 'hadoop', 'groups': [u'users']}
> 2016-02-24 10:51:15,170 - User['hdfs'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,171 - User['yarn'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,172 - User['hcat'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,172 - User['mapred'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,173 - User['hbase'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,174 - User['knox'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,175 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content':
StaticFile('changeToSecureUid.sh'), 'mode': 0555}
> 2016-02-24 10:51:15,177 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa']
{'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
> 2016-02-24 10:51:15,182 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa
/tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa']
due to not_if
> 2016-02-24 10:51:15,183 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'recursive':
True, 'mode': 0775, 'cd_access': 'a'}
> 2016-02-24 10:51:15,184 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content':
StaticFile('changeToSecureUid.sh'), 'mode': 0555}
> 2016-02-24 10:51:15,185 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase']
{'not_if': '(test $(id -u hbase) -gt 1000) || (false)'}
> 2016-02-24 10:51:15,190 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase
/home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] due to not_if
> 2016-02-24 10:51:15,191 - Group['hdfs'] {'ignore_failures': False}
> 2016-02-24 10:51:15,191 - User['hdfs'] {'ignore_failures': False, 'groups': [u'hadoop',
u'hdfs']}
> 2016-02-24 10:51:15,192 - Directory['/etc/hadoop'] {'mode': 0755}
> 2016-02-24 10:51:15,210 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content':
InlineTemplate(...), 'owner': 'root', 'group': 'hadoop'}
> 2016-02-24 10:51:15,211 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir']
{'owner': 'hdfs', 'group': 'hadoop', 'mode': 0777}
> 2016-02-24 10:51:15,224 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce
) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if':
'test -f /selinux/enforce'}
> 2016-02-24 10:51:15,237 - Skipping Execute[('setenforce', '0')] due to not_if
> 2016-02-24 10:51:15,237 - Directory['/var/log/hadoop'] {'owner': 'root', 'mode': 0775,
'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
> 2016-02-24 10:51:15,240 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root',
'recursive': True, 'cd_access': 'a'}
> 2016-02-24 10:51:15,240 - Changing owner for /var/run/hadoop from 1006 to root
> 2016-02-24 10:51:15,240 - Changing group for /var/run/hadoop from 1001 to root
> 2016-02-24 10:51:15,240 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'recursive':
True, 'cd_access': 'a'}
> 2016-02-24 10:51:15,245 - File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties']
{'content': Template('commons-logging.properties.j2'), 'owner': 'root'}
> 2016-02-24 10:51:15,247 - File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content':
Template('health_check.j2'), 'owner': 'root'}
> 2016-02-24 10:51:15,248 - File['/usr/hdp/current/hadoop-client/conf/log4j.properties']
{'content': ..., 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
> 2016-02-24 10:51:15,259 - File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties']
{'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
> 2016-02-24 10:51:15,260 - File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties']
{'content': StaticFile('task-log4j.properties'), 'mode': 0755}
> 2016-02-24 10:51:15,261 - File['/usr/hdp/current/hadoop-client/conf/configuration.xsl']
{'owner': 'hdfs', 'group': 'hadoop'}
> 2016-02-24 10:51:15,266 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs',
'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group':
'hadoop'}
> 2016-02-24 10:51:15,271 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'),
'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
> 2016-02-24 10:51:15,467 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists,
will call conf-select on it for version 2.3.4.0-3485
> 2016-02-24 10:51:15,468 - Checking if need to create versioned conf dir /etc/hadoop/2.3.4.0-3485/0
> 2016-02-24 10:51:15,468 - call['conf-select create-conf-dir --package hadoop --stack-version
2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr':
-1}
> 2016-02-24 10:51:15,501 - call returned (1, '/etc/hadoop/2.3.4.0-3485/0 exist already',
'')
> 2016-02-24 10:51:15,501 - checked_call['conf-select set-conf-dir --package hadoop --stack-version
2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2016-02-24 10:51:15,534 - checked_call returned (0, '/usr/hdp/2.3.4.0-3485/hadoop/conf
-> /etc/hadoop/2.3.4.0-3485/0')
> 2016-02-24 10:51:15,534 - Ensuring that hadoop has the correct symlink structure
> 2016-02-24 10:51:15,534 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2016-02-24 10:51:15,536 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists,
will call conf-select on it for version 2.3.4.0-3485
> 2016-02-24 10:51:15,536 - Checking if need to create versioned conf dir /etc/hadoop/2.3.4.0-3485/0
> 2016-02-24 10:51:15,537 - call['conf-select create-conf-dir --package hadoop --stack-version
2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr':
-1}
> 2016-02-24 10:51:15,565 - call returned (1, '/etc/hadoop/2.3.4.0-3485/0 exist already',
'')
> 2016-02-24 10:51:15,566 - checked_call['conf-select set-conf-dir --package hadoop --stack-version
2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2016-02-24 10:51:15,595 - checked_call returned (0, '/usr/hdp/2.3.4.0-3485/hadoop/conf
-> /etc/hadoop/2.3.4.0-3485/0')
> 2016-02-24 10:51:15,596 - Ensuring that hadoop has the correct symlink structure
> 2016-02-24 10:51:15,596 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2016-02-24 10:51:15,605 - Directory['/etc/security/limits.d'] {'owner': 'root', 'group':
'root', 'recursive': True}
> 2016-02-24 10:51:15,612 - File['/etc/security/limits.d/hdfs.conf'] {'content': Template('hdfs.conf.j2'),
'owner': 'root', 'group': 'root', 'mode': 0644}
> 2016-02-24 10:51:15,613 - XmlConfig['hadoop-policy.xml'] {'owner': 'hdfs', 'group': 'hadoop',
'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations':
...}
> 2016-02-24 10:51:15,626 - Generating config: /usr/hdp/current/hadoop-client/conf/hadoop-policy.xml
> 2016-02-24 10:51:15,627 - File['/usr/hdp/current/hadoop-client/conf/hadoop-policy.xml']
{'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding':
'UTF-8'}
> 2016-02-24 10:51:15,638 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop',
'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations':
...}
> 2016-02-24 10:51:15,649 - Generating config: /usr/hdp/current/hadoop-client/conf/ssl-client.xml
> 2016-02-24 10:51:15,650 - File['/usr/hdp/current/hadoop-client/conf/ssl-client.xml']
{'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding':
'UTF-8'}
> 2016-02-24 10:51:15,657 - Directory['/usr/hdp/current/hadoop-client/conf/secure'] {'owner':
'root', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
> 2016-02-24 10:51:15,658 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop',
'conf_dir': '/usr/hdp/current/hadoop-client/conf/secure', 'configuration_attributes': {},
'configurations': ...}
> 2016-02-24 10:51:15,669 - Generating config: /usr/hdp/current/hadoop-client/conf/secure/ssl-client.xml
> 2016-02-24 10:51:15,669 - File['/usr/hdp/current/hadoop-client/conf/secure/ssl-client.xml']
{'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding':
'UTF-8'}
> 2016-02-24 10:51:15,677 - XmlConfig['ssl-server.xml'] {'owner': 'hdfs', 'group': 'hadoop',
'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations':
...}
> 2016-02-24 10:51:15,688 - Generating config: /usr/hdp/current/hadoop-client/conf/ssl-server.xml
> 2016-02-24 10:51:15,689 - File['/usr/hdp/current/hadoop-client/conf/ssl-server.xml']
{'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding':
'UTF-8'}
> 2016-02-24 10:51:15,697 - XmlConfig['hdfs-site.xml'] {'owner': 'hdfs', 'group': 'hadoop',
'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations':
...}
> 2016-02-24 10:51:15,708 - Generating config: /usr/hdp/current/hadoop-client/conf/hdfs-site.xml
> 2016-02-24 10:51:15,709 - File['/usr/hdp/current/hadoop-client/conf/hdfs-site.xml'] {'owner':
'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
> 2016-02-24 10:51:15,770 - XmlConfig['core-site.xml'] {'group': 'hadoop', 'conf_dir':
'/usr/hdp/current/hadoop-client/conf', 'mode': 0644, 'configuration_attributes': {}, 'owner':
'hdfs', 'configurations': ...}
> 2016-02-24 10:51:15,781 - Generating config: /usr/hdp/current/hadoop-client/conf/core-site.xml
> 2016-02-24 10:51:15,782 - File['/usr/hdp/current/hadoop-client/conf/core-site.xml'] {'owner':
'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0644, 'encoding': 'UTF-8'}
> 2016-02-24 10:51:15,810 - File['/usr/hdp/current/hadoop-client/conf/slaves'] {'content':
Template('slaves.j2'), 'owner': 'root'}
> 2016-02-24 10:51:15,811 - Directory['/var/lib/hadoop-hdfs'] {'owner': 'hdfs', 'group':
'hadoop', 'mode': 0751, 'recursive': True}
> 2016-02-24 10:51:15,817 - Host contains mounts: ['/sys', '/proc', '/dev', '/sys/kernel/security',
'/dev/shm', '/dev/pts', '/run', '/sys/fs/cgroup', '/sys/fs/cgroup/systemd', '/sys/fs/pstore',
'/sys/fs/cgroup/perf_event', '/sys/fs/cgroup/memory', '/sys/fs/cgroup/devices', '/sys/fs/cgroup/cpuset',
'/sys/fs/cgroup/hugetlb', '/sys/fs/cgroup/freezer', '/sys/fs/cgroup/blkio', '/sys/fs/cgroup/cpu,cpuacct',
'/sys/fs/cgroup/net_cls', '/sys/kernel/config', '/', '/proc/sys/fs/binfmt_misc', '/dev/mqueue',
'/sys/kernel/debug', '/dev/hugepages', '/run/user/0', '/run/user/1000', '/proc/sys/fs/binfmt_misc'].
> 2016-02-24 10:51:15,817 - Mount point for directory /hadoop/hdfs/data is /
> 2016-02-24 10:51:15,817 - File['/var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist']
{'content': '\n# This file keeps track of the last known mount-point for each DFS data dir.\n#
It is safe to delete, since it will get regenerated the next time that the DataNode starts.\n#
However, it is not advised to delete this file since Ambari may\n# re-create a DFS data dir
that used to be mounted on a drive but is now mounted on the root.\n# Comments begin with
a hash (#) symbol\n# data_dir,mount_point\n/hadoop/hdfs/data,/\n', 'owner': 'hdfs', 'group':
'hadoop', 'mode': 0644}
> 2016-02-24 10:51:15,819 - Directory['/var/run/hadoop'] {'owner': 'hdfs', 'group': 'hadoop',
'mode': 0755}
> 2016-02-24 10:51:15,819 - Changing owner for /var/run/hadoop from 0 to hdfs
> 2016-02-24 10:51:15,819 - Changing group for /var/run/hadoop from 0 to hadoop
> 2016-02-24 10:51:15,819 - Directory['/var/run/hadoop/hdfs'] {'owner': 'hdfs', 'recursive':
True}
> 2016-02-24 10:51:15,820 - Directory['/var/log/hadoop/hdfs'] {'owner': 'hdfs', 'recursive':
True}
> 2016-02-24 10:51:15,820 - File['/var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid'] {'action':
['delete'], 'not_if': 'ambari-sudo.sh  -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid
&& ambari-sudo.sh  -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid'}
> 2016-02-24 10:51:15,833 - Deleting File['/var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid']
> 2016-02-24 10:51:15,833 - Execute['ambari-sudo.sh  -H -E /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh
--config /usr/hdp/current/hadoop-client/conf start datanode'] {'environment': {'HADOOP_LIBEXEC_DIR':
'/usr/hdp/current/hadoop-client/libexec'}, 'not_if': 'ambari-sudo.sh  -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid
&& ambari-sudo.sh  -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid'}
> {code}
> When I attempted to run the hdfs ... datanode command directly like so:
> {code}
> strace -s 2000 -o ~/slog.txt /usr/hdp/2.3.4.0-3485/hadoop-hdfs/bin/hdfs --config /usr/hdp/current/hadoop-client/conf
datanode
> {code}
> I noticed this section, which mentions two additional log files I hadn't seen before.
> {code}
> read(255, "#!/usr/bin/env bash\n\n# Licensed to the Apache Software Foundation (ASF)
under one or more\n# contributor license agreements.  See the NOTICE file distributed with\n#
this work for additional information regarding copyright ownership.\n# The ASF licenses this
file to You under the Apache License, Version 2.0\n# (the \"License\"); you may not use this
file except in compliance with\n# the License.  You may obtain a copy of the License at\n#\n#
    http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or
agreed to in writing, software\n# distributed under the License is distributed on an \"AS
IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n#
See the License for the specific language governing permissions and\n# limitations under the
License.\n\n# Environment Variables\n#\n#   JSVC_HOME  home directory of jsvc binary.  Required
for starting secure\n#              datanode.\n#\n#   JSVC_OUTFILE  path to jsvc output file.
 Defaults to\n#                 $HADOOP_LOG_DIR/jsvc.out.\n#\n#   JSVC_ERRFILE  path to jsvc
error file.  Defaults to $HADOOP_LOG_DIR/jsvc.err.\n\nbin=`which $0`\nbin=`dirname ${bin}`\nbin=`cd
\"$bin\" > /dev/null; pwd`\n\nDEFAULT_LIBEXEC_DIR=\"$bin\"/../libexec\n\nif [ -n \"$HADOOP_HOME\"
]; then\n  DEFAULT_LIBEXEC_DIR=\"$HADOOP_HOME\"/libexec\nfi\n\nHADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}\n.
$HADOOP_LIBEXEC_DIR/hdfs-config.sh\n\nfunction print_usage(){\n  echo \"Usage: hdfs [--config
confdir] [--loglevel loglevel] COMMAND\"\n  echo \"       where COMMAND is one of:\"\n  echo
\"  dfs                  run a filesystem command on the file systems supported in Hadoop.\"\n
 echo \"  classpath            prints the classpath\"\n  echo \"  namenode -format     format
the DFS filesystem\"\n  echo \"  secondarynamenode    run the DFS secondary namenode\"\n 
echo \"  namenode             run the DFS namenode\"\n  echo \"  journalnode          run
the DFS journalnode\"\n  echo \"  zkfc                 run the ZK Failover Controller daemon\"\n
 echo"..., 8192) = 8192
> {code}
> Specifically these files:
> - /var/log/hadoop/hdfs/jsvc.out
> - /var/log/hadoop/hdfs/jsvc.err
> Looking in the jsvc.err file, I found this:
> {code}
> STARTUP_MSG:   build = git@github.com:hortonworks/hadoop.git -r ef0582ca14b8177a3cbb6376807545272677d730;
compiled by 'jenkins' on 2015-12-16T03:01Z
> STARTUP_MSG:   java = 1.8.0_60
> ************************************************************/
> 16/02/24 11:30:18 INFO datanode.DataNode: registered UNIX signal handlers for [TERM,
HUP, INT]
> 16/02/24 11:30:18 FATAL datanode.DataNode: Exception in secureMain
> java.io.IOException: Login failure for dn/host-192-168-114-49.td.local@<REDACTED KERBEROS
REALM> from keytab /etc/security/keytabs/dn.service.keytab: javax.security.auth.login.LoginException:
Unable to obtain password from user
>         at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:962)
>         at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:275)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2296)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2345)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2526)
>         at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:76)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
> Caused by: javax.security.auth.login.LoginException: Unable to obtain password from user
>         at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:897)
>         at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760)
>         at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
>         at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
>         at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
>         at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
>         at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
>         at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:953)
>         ... 10 more
> 16/02/24 11:30:18 INFO util.ExitUtil: Exiting with status 1
> 16/02/24 11:30:18 INFO datanode.DataNode: SHUTDOWN_MSG:
> /************************************************************
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
