ambari-dev mailing list archives

From "Andrew Onischuk (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (AMBARI-8560) Storm fails to start after adding service and restart attempts
Date Fri, 05 Dec 2014 18:59:13 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Onischuk resolved AMBARI-8560.
-------------------------------------
    Resolution: Fixed

Committed to trunk

> Storm fails to start after adding service and restart attempts
> --------------------------------------------------------------
>
>                 Key: AMBARI-8560
>                 URL: https://issues.apache.org/jira/browse/AMBARI-8560
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Andrew Onischuk
>            Assignee: Andrew Onischuk
>             Fix For: 2.0.0
>
>
> Initially, DRPC failed to start during the install-and-start phase of 'Add service'.
> After returning to the dashboard, a restart of 'All Storm Components' failed to
> start Nimbus:
>     
>     
>     stderr:   /var/lib/ambari-agent/data/errors-2269.txt
>     
>     2014-11-09 22:41:24,539 - Error while executing command 'restart':
>     Traceback (most recent call last):
>       File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 123, in execute
>         method(env)
>       File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 233, in restart
>         self.start(env)
>       File "/var/lib/ambari-agent/cache/stacks/HDP/2.1/services/STORM/package/scripts/nimbus.py", line 43, in start
>         service("nimbus", action="start")
>       File "/var/lib/ambari-agent/cache/stacks/HDP/2.1/services/STORM/package/scripts/service.py", line 69, in service
>         path=params.storm_bin_dir
>       File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
>         self.env.run()
>       File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
>         self.run_action(resource, action)
>       File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
>         provider_action()
>       File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 241, in action_run
>         raise ex
>     Fail: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
>     stdout:   /var/lib/ambari-agent/data/output-2269.txt
>     
>     2014-11-09 22:40:06,527 - Execute['mkdir -p /var/lib/ambari-agent/data/tmp/AMBARI-artifacts/;     curl -kf -x "" --retry 10     http://pt170o-1.c.pramod-thangali.internal:8080/resources//UnlimitedJCEPolicyJDK7.zip -o /var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip'] {'environment': ..., 'not_if': 'test -e /var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip', 'ignore_failures': True, 'path': ['/bin', '/usr/bin/']}
>     2014-11-09 22:40:06,590 - Skipping Execute['mkdir -p /var/lib/ambari-agent/data/tmp/AMBARI-artifacts/;     curl -kf -x "" --retry 10     http://pt170o-1.c.pramod-thangali.internal:8080/resources//UnlimitedJCEPolicyJDK7.zip -o /var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip'] due to not_if
>     2014-11-09 22:40:06,591 - Group['hadoop'] {'ignore_failures': False}
>     2014-11-09 22:40:06,592 - Modifying group hadoop
>     2014-11-09 22:40:06,785 - Group['nobody'] {'ignore_failures': False}
>     2014-11-09 22:40:06,786 - Modifying group nobody
>     2014-11-09 22:40:06,930 - Group['users'] {'ignore_failures': False}
>     2014-11-09 22:40:06,930 - Modifying group users
>     2014-11-09 22:40:07,064 - Group['nagios'] {'ignore_failures': False}
>     2014-11-09 22:40:07,065 - Modifying group nagios
>     2014-11-09 22:40:07,229 - Group['knox'] {'ignore_failures': False}
>     2014-11-09 22:40:07,229 - Modifying group knox
>     2014-11-09 22:40:07,383 - User['nobody'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'nobody']}
>     2014-11-09 22:40:07,384 - Modifying user nobody
>     2014-11-09 22:40:07,506 - User['hive'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:07,507 - Modifying user hive
>     2014-11-09 22:40:07,568 - User['oozie'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
>     2014-11-09 22:40:07,569 - Modifying user oozie
>     2014-11-09 22:40:07,643 - User['nagios'] {'gid': 'nagios', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:07,643 - Modifying user nagios
>     2014-11-09 22:40:07,713 - User['ambari-qa'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
>     2014-11-09 22:40:07,714 - Modifying user ambari-qa
>     2014-11-09 22:40:07,780 - User['flume'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:07,780 - Modifying user flume
>     2014-11-09 22:40:07,849 - User['hdfs'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:07,849 - Modifying user hdfs
>     2014-11-09 22:40:07,892 - User['knox'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:07,892 - Modifying user knox
>     2014-11-09 22:40:07,935 - User['storm'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:07,936 - Modifying user storm
>     2014-11-09 22:40:07,977 - User['mapred'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:07,978 - Modifying user mapred
>     2014-11-09 22:40:08,056 - User['hbase'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:08,057 - Modifying user hbase
>     2014-11-09 22:40:08,130 - User['tez'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
>     2014-11-09 22:40:08,131 - Modifying user tez
>     2014-11-09 22:40:08,312 - User['zookeeper'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:08,312 - Modifying user zookeeper
>     2014-11-09 22:40:08,518 - User['kafka'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:08,519 - Modifying user kafka
>     2014-11-09 22:40:08,616 - User['falcon'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:08,616 - Modifying user falcon
>     2014-11-09 22:40:08,716 - User['sqoop'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:08,717 - Modifying user sqoop
>     2014-11-09 22:40:08,794 - User['yarn'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:08,794 - Modifying user yarn
>     2014-11-09 22:40:08,890 - User['hcat'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
>     2014-11-09 22:40:08,894 - Modifying user hcat
>     2014-11-09 22:40:08,960 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
>     2014-11-09 22:40:08,962 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 2>/dev/null'] {'not_if': 'test $(id -u ambari-qa) -gt 1000'}
>     2014-11-09 22:40:09,009 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 2>/dev/null'] due to not_if
>     2014-11-09 22:40:09,010 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
>     2014-11-09 22:40:09,011 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/hadoop/hbase 2>/dev/null'] {'not_if': 'test $(id -u hbase) -gt 1000'}
>     2014-11-09 22:40:09,066 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/hadoop/hbase 2>/dev/null'] due to not_if
>     2014-11-09 22:40:09,066 - Directory['/etc/hadoop/conf.empty'] {'owner': 'root', 'group': 'root', 'recursive': True}
>     2014-11-09 22:40:09,067 - Link['/etc/hadoop/conf'] {'not_if': 'ls /etc/hadoop/conf', 'to': '/etc/hadoop/conf.empty'}
>     2014-11-09 22:40:09,125 - Skipping Link['/etc/hadoop/conf'] due to not_if
>     2014-11-09 22:40:09,202 - File['/etc/hadoop/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs'}
>     2014-11-09 22:40:09,305 - Execute['/bin/echo 0 > /selinux/enforce'] {'only_if': 'test -f /selinux/enforce'}
>     2014-11-09 22:40:09,456 - Directory['/var/log/hadoop'] {'owner': 'root', 'group': 'hadoop', 'mode': 0775, 'recursive': True}
>     2014-11-09 22:40:09,457 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True}
>     2014-11-09 22:40:09,457 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'recursive': True}
>     2014-11-09 22:40:09,468 - File['/etc/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
>     2014-11-09 22:40:09,472 - File['/etc/hadoop/conf/health_check'] {'content': Template('health_check-v2.j2'), 'owner': 'hdfs'}
>     2014-11-09 22:40:09,472 - File['/etc/hadoop/conf/log4j.properties'] {'content': '...', 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
>     2014-11-09 22:40:09,483 - File['/etc/hadoop/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
>     2014-11-09 22:40:09,485 - File['/etc/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
>     2014-11-09 22:40:09,951 - Execute['kill `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1'] {'not_if': '! (ls /var/run/storm/nimbus.pid >/dev/null 2>&1 && ps `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1)'}
>     2014-11-09 22:40:10,291 - Execute['kill -9 `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1'] {'not_if': 'sleep 2; ! (ls /var/run/storm/nimbus.pid >/dev/null 2>&1 && ps `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1) || sleep 20; ! (ls /var/run/storm/nimbus.pid >/dev/null 2>&1 && ps `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1)', 'ignore_failures': True}
>     2014-11-09 22:40:12,721 - Skipping Execute['kill -9 `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1'] due to not_if
>     2014-11-09 22:40:12,722 - Execute['rm -f /var/run/storm/nimbus.pid'] {}
>     2014-11-09 22:40:12,784 - Directory['/var/log/storm'] {'owner': 'storm', 'group': 'hadoop', 'recursive': True, 'mode': 0775}
>     2014-11-09 22:40:12,785 - Directory['/var/run/storm'] {'owner': 'storm', 'group': 'hadoop', 'recursive': True}
>     2014-11-09 22:40:12,785 - Directory['/hadoop/storm'] {'owner': 'storm', 'group': 'hadoop', 'recursive': True}
>     2014-11-09 22:40:12,786 - Directory['/etc/storm/conf'] {'owner': 'storm', 'group': 'hadoop', 'recursive': True}
>     2014-11-09 22:40:12,814 - File['/etc/storm/conf/config.yaml'] {'owner': 'storm', 'content': Template('config.yaml.j2'), 'group': 'hadoop'}
>     2014-11-09 22:40:12,890 - File['/etc/storm/conf/storm.yaml'] {'owner': 'storm', 'content': Template('storm.yaml.j2'), 'group': 'hadoop'}
>     2014-11-09 22:40:12,914 - File['/etc/storm/conf/storm-env.sh'] {'content': InlineTemplate(...), 'owner': 'storm'}
>     2014-11-09 22:40:12,919 - Execute['env JAVA_HOME=/usr/jdk64/jdk1.7.0_67 PATH=$PATH:/usr/jdk64/jdk1.7.0_67/bin storm nimbus > /var/log/storm/nimbus.out 2>&1'] {'wait_for_finish': False, 'path': ['/usr/hdp/current/storm-client/bin'], 'user': 'storm', 'not_if': 'ls /var/run/storm/nimbus.pid >/dev/null 2>&1 && ps `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1'}
>     2014-11-09 22:40:13,004 - Execute['/usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid'] {'logoutput': True, 'path': ['/usr/hdp/current/storm-client/bin'], 'tries': 6, 'user': 'storm', 'try_sleep': 10}
>     2014-11-09 22:40:17,201 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
>     2014-11-09 22:40:31,111 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
>     2014-11-09 22:40:45,816 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
>     2014-11-09 22:40:58,878 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
>     2014-11-09 22:41:10,785 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
>     2014-11-09 22:41:24,539 - Error while executing command 'restart':
>     Traceback (most recent call last):
>       File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 123, in execute
>         method(env)
>       File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 233, in restart
>         self.start(env)
>       File "/var/lib/ambari-agent/cache/stacks/HDP/2.1/services/STORM/package/scripts/nimbus.py", line 43, in start
>         service("nimbus", action="start")
>       File "/var/lib/ambari-agent/cache/stacks/HDP/2.1/services/STORM/package/scripts/service.py", line 69, in service
>         path=params.storm_bin_dir
>       File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
>         self.env.run()
>       File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
>         self.run_action(resource, action)
>       File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
>         provider_action()
>       File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 241, in action_run
>         raise ex
>     Fail: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l  | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
>     
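
For readers following the log: the failing step is a PID check that is retried ('tries': 6, 'try_sleep': 10) until `jps -l` lists a class ending in storm.daemon.nimbus. The Python below is only a rough sketch of that check-and-retry pattern, not the code from resource_management or from the fix. The names `find_pid`, `run_with_retries`, `nimbus_pid_from_jps`, and the local `Fail` class are hypothetical, and the suffix match only approximates the `grep ... | awk` pipeline in the log.

```python
import subprocess
import time


class Fail(Exception):
    """Illustrative stand-in for the Fail error in the traceback; not Ambari's class."""


def find_pid(jps_output, class_suffix):
    """Return the PID from the first `jps -l` line whose class name ends with class_suffix."""
    for line in jps_output.splitlines():
        parts = line.split()
        # A `jps -l` line looks like "<pid> <main class or jar path>".
        if len(parts) >= 2 and parts[-1].endswith(class_suffix):
            return parts[0]
    return None


def run_with_retries(check, tries=6, try_sleep=10, sleep=time.sleep):
    """Call check() up to `tries` times and return its first non-None result."""
    for attempt in range(1, tries + 1):
        result = check()
        if result is not None:
            return result
        if attempt < tries:
            # Sleep only between attempts, never after the last one.
            sleep(try_sleep)
    raise Fail("no matching process after %d tries" % tries)


def nimbus_pid_from_jps(jps_path="jps"):
    """Hypothetical wrapper: run `jps -l` and look for the Nimbus main class."""
    output = subprocess.check_output([jps_path, "-l"]).decode()
    return find_pid(output, "storm.daemon.nimbus")
```

Under these assumptions, `run_with_retries(nimbus_pid_from_jps)` sleeps five times before giving up on the sixth attempt, which lines up with the five "Retrying after 10 seconds" entries in the log above before the final Fail.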



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
