ambari-dev mailing list archives

From "Alejandro Fernandez (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-10317) Knox gateway fails to restart on Ubuntu 12.04 after system restart because /var/run/knox is deleted
Date Thu, 02 Apr 2015 22:02:52 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393549#comment-14393549 ]

Alejandro Fernandez commented on AMBARI-10317:
----------------------------------------------

Thanks for the pointer. I have a code review ready so that I can commit this to trunk: https://reviews.apache.org/r/32791/

> Knox gateway fails to restart on Ubuntu 12.04 after system restart because /var/run/knox is deleted
> ---------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-10317
>                 URL: https://issues.apache.org/jira/browse/AMBARI-10317
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.0.0, 2.1.0
>         Environment: ubuntu 12.04
>            Reporter: David McWhorter
>            Assignee: Alejandro Fernandez
>             Fix For: 2.1.0
>
>         Attachments: AMBARI-10317.patch, AMBARI-10317.v2.patch, AMBARI-10317_branch-2.0.0.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> We are testing the deployment of an HDP 2.2 cluster using Ambari 2.0.0-rc2 running on Ubuntu
> 12.04. I have been able to set up a cluster running HDFS, MapReduce2, YARN, ZooKeeper, Knox,
> Ranger, and Ambari Metrics. When I shut down the whole cluster using Actions -> Stop All
> in Ambari, reboot the hosts, and then try to restart the cluster, I see the error below when
> restarting the Knox gateway. The directory /var/run/knox is indeed missing on the master host.
> Knox Gateway startup log:
> 2015-04-01 16:17:12,075 - Error while executing command 'start':
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 214, in execute
>     method(env)
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89,
in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py",
line 80, in start
>     self.configure(env)
>   File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py",
line 64, in configure
>     knox()
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89,
in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox.py",
line 99, in knox
>     sudo = True,
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148,
in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
152, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
118, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
line 274, in action_run
>     raise ex
> Fail: Execution of 'chown -R knox:knox /var/lib/knox/data /var/log/knox /var/log/knox
/var/run/knox /etc/knox/conf' returned 1. chown: cannot access `/var/run/knox': No such file
or directory
> stdout:   /var/lib/ambari-agent/data/output-107.txt
> 2015-04-01 16:17:06,744 - u"Group['hadoop']" {'ignore_failures': False}
> 2015-04-01 16:17:06,744 - Modifying group hadoop
> 2015-04-01 16:17:06,797 - u"Group['users']" {'ignore_failures': False}
> 2015-04-01 16:17:06,797 - Modifying group users
> 2015-04-01 16:17:06,839 - u"Group['knox']" {'ignore_failures': False}
> 2015-04-01 16:17:06,839 - Modifying group knox
> 2015-04-01 16:17:06,886 - u"Group['ranger']" {'ignore_failures': False}
> 2015-04-01 16:17:06,886 - Modifying group ranger
> 2015-04-01 16:17:06,930 - u"User['mapred']" {'gid': 'hadoop', 'ignore_failures': False,
'groups': [u'hadoop']}
> 2015-04-01 16:17:06,930 - Modifying user mapred
> 2015-04-01 16:17:06,976 - u"User['root']" {'gid': 'hadoop', 'ignore_failures': False,
'groups': [u'hadoop']}
> 2015-04-01 16:17:06,977 - Modifying user root
> 2015-04-01 16:17:07,019 - u"User['ambari-qa']" {'gid': 'hadoop', 'ignore_failures': False,
'groups': [u'users']}
> 2015-04-01 16:17:07,020 - Modifying user ambari-qa
> 2015-04-01 16:17:07,066 - u"User['zookeeper']" {'gid': 'hadoop', 'ignore_failures': False,
'groups': [u'hadoop']}
> 2015-04-01 16:17:07,066 - Modifying user zookeeper
> 2015-04-01 16:17:07,109 - u"User['rangerlogger']" {'gid': 'hadoop', 'ignore_failures':
False, 'groups': [u'hadoop']}
> 2015-04-01 16:17:07,110 - Modifying user rangerlogger
> 2015-04-01 16:17:07,152 - u"User['hdfs']" {'gid': 'hadoop', 'ignore_failures': False,
'groups': [u'hadoop']}
> 2015-04-01 16:17:07,152 - Modifying user hdfs
> 2015-04-01 16:17:07,195 - u"User['knox']" {'gid': 'hadoop', 'ignore_failures': False,
'groups': [u'hadoop']}
> 2015-04-01 16:17:07,195 - Modifying user knox
> 2015-04-01 16:17:07,238 - u"User['ranger']" {'gid': 'hadoop', 'ignore_failures': False,
'groups': [u'hadoop']}
> 2015-04-01 16:17:07,238 - Modifying user ranger
> 2015-04-01 16:17:07,282 - u"User['yarn']" {'gid': 'hadoop', 'ignore_failures': False,
'groups': [u'hadoop']}
> 2015-04-01 16:17:07,283 - Modifying user yarn
> 2015-04-01 16:17:07,326 - u"User['ams']" {'gid': 'hadoop', 'ignore_failures': False,
'groups': [u'hadoop']}
> 2015-04-01 16:17:07,327 - Modifying user ams
> 2015-04-01 16:17:07,370 - u"User['rangeradmin']" {'gid': 'hadoop', 'ignore_failures':
False, 'groups': [u'hadoop']}
> 2015-04-01 16:17:07,370 - Modifying user rangeradmin
> 2015-04-01 16:17:07,413 - u"File['/var/lib/ambari-agent/data/tmp/changeUid.sh']" {'content':
StaticFile('changeToSecureUid.sh'), 'mode': 0555}
> 2015-04-01 16:17:07,686 - u"Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa
/tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa']"
{'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
> 2015-04-01 16:17:07,728 - Skipping u"Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh
ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa']"
due to not_if
> 2015-04-01 16:17:07,728 - u"Group['hdfs']" {'ignore_failures': False}
> 2015-04-01 16:17:07,728 - Modifying group hdfs
> 2015-04-01 16:17:07,774 - u"User['hdfs']" {'ignore_failures': False, 'groups': [u'hadoop',
'hadoop', 'hdfs', u'hdfs']}
> 2015-04-01 16:17:07,775 - Modifying user hdfs
> 2015-04-01 16:17:07,818 - u"Directory['/etc/hadoop']" {'mode': 0755}
> 2015-04-01 16:17:07,974 - u"Directory['/etc/hadoop/conf.empty']" {'owner': 'root', 'group':
'hadoop', 'recursive': True}
> 2015-04-01 16:17:08,110 - u"Link['/etc/hadoop/conf']" {'not_if': 'ls /etc/hadoop/conf',
'to': '/etc/hadoop/conf.empty'}
> 2015-04-01 16:17:08,153 - Skipping u"Link['/etc/hadoop/conf']" due to not_if
> 2015-04-01 16:17:08,160 - u"File['/etc/hadoop/conf/hadoop-env.sh']" {'content': InlineTemplate(...),
'owner': 'hdfs', 'group': 'hadoop'}
> 2015-04-01 16:17:08,396 - u"Execute['('setenforce', '0')']" {'sudo': True, 'only_if':
'test -f /selinux/enforce'}
> 2015-04-01 16:17:08,448 - Skipping u"Execute['('setenforce', '0')']" due to only_if
> 2015-04-01 16:17:08,448 - u"Directory['/var/log/hadoop']" {'owner': 'root', 'mode': 0775,
'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
> 2015-04-01 16:17:08,843 - u"Directory['/var/run/hadoop']" {'owner': 'root', 'group':
'root', 'recursive': True, 'cd_access': 'a'}
> 2015-04-01 16:17:08,886 - Creating directory u"Directory['/var/run/hadoop']"
> 2015-04-01 16:17:09,066 - Changing group for /var/run/hadoop from 1000 to root
> 2015-04-01 16:17:09,364 - u"Directory['/tmp/hadoop-hdfs']" {'owner': 'hdfs', 'recursive':
True, 'cd_access': 'a'}
> 2015-04-01 16:17:09,407 - Creating directory u"Directory['/tmp/hadoop-hdfs']"
> 2015-04-01 16:17:09,587 - Changing owner for /tmp/hadoop-hdfs from 0 to hdfs
> 2015-04-01 16:17:09,820 - u"File['/etc/hadoop/conf/commons-logging.properties']" {'content':
Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
> 2015-04-01 16:17:10,049 - u"File['/etc/hadoop/conf/health_check']" {'content': Template('health_check-v2.j2'),
'owner': 'hdfs'}
> 2015-04-01 16:17:10,272 - u"File['/etc/hadoop/conf/log4j.properties']" {'content': '...',
'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
> 2015-04-01 16:17:10,506 - u"File['/etc/hadoop/conf/hadoop-metrics2.properties']" {'content':
Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
> 2015-04-01 16:17:10,732 - u"File['/etc/hadoop/conf/task-log4j.properties']" {'content':
StaticFile('task-log4j.properties'), 'mode': 0755}
> 2015-04-01 16:17:11,085 - u"Directory['/etc/knox/conf']" {'owner': 'knox', 'group': 'knox',
'recursive': True}
> 2015-04-01 16:17:11,231 - u"XmlConfig['gateway-site.xml']" {'owner': 'knox', 'group':
'knox', 'conf_dir': '/etc/knox/conf', 'configuration_attributes': {}, 'configurations': ...}
> 2015-04-01 16:17:11,239 - Generating config: /etc/knox/conf/gateway-site.xml
> 2015-04-01 16:17:11,239 - u"File['/etc/knox/conf/gateway-site.xml']" {'owner': 'knox',
'content': InlineTemplate(...), 'group': 'knox', 'mode': None, 'encoding': 'UTF-8'}
> 2015-04-01 16:17:11,422 - Writing u"File['/etc/knox/conf/gateway-site.xml']" because
contents don't match
> 2015-04-01 16:17:11,561 - u"File['/etc/knox/conf/gateway-log4j.properties']" {'content':
'...', 'owner': 'knox', 'group': 'knox', 'mode': 0644}
> 2015-04-01 16:17:11,790 - u"File['/etc/knox/conf/topologies/default.xml']" {'content':
InlineTemplate(...), 'owner': 'knox', 'group': 'knox'}
> 2015-04-01 16:17:12,014 - u"Execute['('chown', '-R', u'knox:knox', '/var/lib/knox/data',
'/var/log/knox', '/var/log/knox', u'/var/run/knox', '/etc/knox/conf')']" {'sudo': True}
> 2015-04-01 16:17:12,075 - Error while executing command 'start':
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 214, in execute
>     method(env)
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89,
in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py",
line 80, in start
>     self.configure(env)
>   File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py",
line 64, in configure
>     knox()
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89,
in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox.py",
line 99, in knox
>     sudo = True,
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148,
in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
152, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line
118, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
line 274, in action_run
>     raise ex
> Fail: Execution of 'chown -R knox:knox /var/lib/knox/data /var/log/knox /var/log/knox
/var/run/knox /etc/knox/conf' returned 1. chown: cannot access `/var/run/knox': No such file
or directory
> 2015-04-01 16:17:12,119 - Command: /usr/bin/hdp-select status knox-server > /tmp/tmp7GgVe1
> Output: knox-server - 2.2.0.0-2041
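
The log above shows `chown -R` failing because /var/run/knox no longer exists after the reboot. One common way to guard against this is to recreate the missing runtime directory before changing ownership. The following is only a minimal standard-library sketch of that idea, not the patch attached to this issue; the helper name `ensure_dirs` is hypothetical, and a temporary directory stands in for /var/run so the example can run without root.

```python
import os
import tempfile


def ensure_dirs(paths, mode=0o755):
    """Create every directory in `paths` that is missing.

    Returns the list of directories that had to be created, so the caller
    can log which ones disappeared (for example, after a reboot).
    `ensure_dirs` is a hypothetical helper, not part of Ambari.
    """
    created = []
    for path in paths:
        if not os.path.isdir(path):
            # makedirs also creates any missing parent directories.
            os.makedirs(path, mode)
            created.append(path)
    return created


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for /var/run.
    run_root = tempfile.mkdtemp()
    pid_dir = os.path.join(run_root, "knox")

    # First call recreates the missing directory; second call is a no-op.
    print("created:", ensure_dirs([pid_dir]))
    print("created:", ensure_dirs([pid_dir]))
```

Running the directory check before the ownership change keeps the start command idempotent: it behaves the same whether or not the host has been rebooted since the last start.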



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
