Delivered-To: mailing list dev@ambari.apache.org
Content-Type: multipart/alternative; boundary="===============3858020756795284004=="
MIME-Version: 1.0
Subject: Review Request 28754: Storm fails to start after adding service and restart attempts
From: "Andrew Onischuk"
To: "Dmytro Sen"
Cc: "Andrew Onischuk", "Ambari"
Date: Fri, 05 Dec 2014 14:21:39 -0000
Message-ID: <20141205142139.32525.61927@reviews.apache.org>
X-ReviewRequest-URL: https://reviews.apache.org/r/28754/
X-ReviewRequest-Repository: ambari

--===============3858020756795284004==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

-----------------------------------------------------------
This is an automatically generated e-mail.
To reply, visit: https://reviews.apache.org/r/28754/
-----------------------------------------------------------

Review request for Ambari and Dmytro Sen.


Bugs: AMBARI-8560
    https://issues.apache.org/jira/browse/AMBARI-8560


Repository: ambari


Description
-------

DRPC initially failed to start during the install-and-start phase of 'Add service'. After returning to the dashboard, restarting 'All Storm Components' then failed to start Nimbus:

stderr: /var/lib/ambari-agent/data/errors-2269.txt

2014-11-09 22:41:24,539 - Error while executing command 'restart':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 123, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 233, in restart
    self.start(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.1/services/STORM/package/scripts/nimbus.py", line 43, in start
    service("nimbus", action="start")
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.1/services/STORM/package/scripts/service.py", line 69, in service
    path=params.storm_bin_dir
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 241, in action_run
    raise ex
Fail: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
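For readers following the traceback: the step that fails is the PID capture, which lists JVMs with `jps -l`, keeps the line ending in `storm.daemon.nimbus`, and writes its first field to /var/run/storm/nimbus.pid. The stdout log shows this step being run with 'tries': 6 and 'try_sleep': 10. The following is only a minimal Python sketch of that logic, not the code under review. The names `find_pid`, `wait_for_pid`, and the injected `list_jvms` callable are hypothetical, and the suffix check is a literal approximation of the grep pattern (where `.` matches any character).

```python
import time


def find_pid(jps_output, class_suffix):
    """Return the PID from the first `jps -l` line that ends with class_suffix.

    Mirrors `grep <suffix>$ | awk '{print $1}'`; returns None when nothing
    matches, which is the case where the shell pipeline returns 1.
    """
    for line in jps_output.splitlines():
        if line.rstrip().endswith(class_suffix):
            # The first whitespace-separated field of a jps line is the PID.
            return int(line.split()[0])
    return None


def wait_for_pid(list_jvms, class_suffix, tries=6, try_sleep=10, sleep=time.sleep):
    """Poll list_jvms() up to `tries` times, sleeping between failed attempts.

    list_jvms is a hypothetical callable returning `jps -l` style text.
    Raises RuntimeError if the process never appears.
    """
    for attempt in range(1, tries + 1):
        pid = find_pid(list_jvms(), class_suffix)
        if pid is not None:
            return pid
        if attempt < tries:
            # No sleep after the final attempt; the failure is reported instead.
            sleep(try_sleep)
    raise RuntimeError(
        "no JVM ending in %r found after %d tries" % (class_suffix, tries)
    )
```

With these defaults a daemon that never appears produces five sleeps before the failure, which lines up with the five "Retrying after 10 seconds" entries in the stdout log. If the daemon exits shortly after launch, no number of retries will help, so /var/log/storm/nimbus.out is the place to look next.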
stdout: /var/lib/ambari-agent/data/output-2269.txt

2014-11-09 22:40:06,527 - Execute['mkdir -p /var/lib/ambari-agent/data/tmp/AMBARI-artifacts/; curl -kf -x "" --retry 10 http://pt170o-1.c.pramod-thangali.internal:8080/resources//UnlimitedJCEPolicyJDK7.zip -o /var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip'] {'environment': ..., 'not_if': 'test -e /var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip', 'ignore_failures': True, 'path': ['/bin', '/usr/bin/']}
2014-11-09 22:40:06,590 - Skipping Execute['mkdir -p /var/lib/ambari-agent/data/tmp/AMBARI-artifacts/; curl -kf -x "" --retry 10 http://pt170o-1.c.pramod-thangali.internal:8080/resources//UnlimitedJCEPolicyJDK7.zip -o /var/lib/ambari-agent/data/tmp/AMBARI-artifacts//UnlimitedJCEPolicyJDK7.zip'] due to not_if
2014-11-09 22:40:06,591 - Group['hadoop'] {'ignore_failures': False}
2014-11-09 22:40:06,592 - Modifying group hadoop
2014-11-09 22:40:06,785 - Group['nobody'] {'ignore_failures': False}
2014-11-09 22:40:06,786 - Modifying group nobody
2014-11-09 22:40:06,930 - Group['users'] {'ignore_failures': False}
2014-11-09 22:40:06,930 - Modifying group users
2014-11-09 22:40:07,064 - Group['nagios'] {'ignore_failures': False}
2014-11-09 22:40:07,065 - Modifying group nagios
2014-11-09 22:40:07,229 - Group['knox'] {'ignore_failures': False}
2014-11-09 22:40:07,229 - Modifying group knox
2014-11-09 22:40:07,383 - User['nobody'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'nobody']}
2014-11-09 22:40:07,384 - Modifying user nobody
2014-11-09 22:40:07,506 - User['hive'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:07,507 - Modifying user hive
2014-11-09 22:40:07,568 - User['oozie'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
2014-11-09 22:40:07,569 - Modifying user oozie
2014-11-09 22:40:07,643 - User['nagios'] {'gid': 'nagios', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:07,643 - Modifying user nagios
2014-11-09 22:40:07,713 - User['ambari-qa'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
2014-11-09 22:40:07,714 - Modifying user ambari-qa
2014-11-09 22:40:07,780 - User['flume'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:07,780 - Modifying user flume
2014-11-09 22:40:07,849 - User['hdfs'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:07,849 - Modifying user hdfs
2014-11-09 22:40:07,892 - User['knox'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:07,892 - Modifying user knox
2014-11-09 22:40:07,935 - User['storm'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:07,936 - Modifying user storm
2014-11-09 22:40:07,977 - User['mapred'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:07,978 - Modifying user mapred
2014-11-09 22:40:08,056 - User['hbase'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:08,057 - Modifying user hbase
2014-11-09 22:40:08,130 - User['tez'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
2014-11-09 22:40:08,131 - Modifying user tez
2014-11-09 22:40:08,312 - User['zookeeper'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:08,312 - Modifying user zookeeper
2014-11-09 22:40:08,518 - User['kafka'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:08,519 - Modifying user kafka
2014-11-09 22:40:08,616 - User['falcon'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:08,616 - Modifying user falcon
2014-11-09 22:40:08,716 - User['sqoop'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:08,717 - Modifying user sqoop
2014-11-09 22:40:08,794 - User['yarn'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:08,794 - Modifying user yarn
2014-11-09 22:40:08,890 - User['hcat'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2014-11-09 22:40:08,894 - Modifying user hcat
2014-11-09 22:40:08,960 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2014-11-09 22:40:08,962 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 2>/dev/null'] {'not_if': 'test $(id -u ambari-qa) -gt 1000'}
2014-11-09 22:40:09,009 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 2>/dev/null'] due to not_if
2014-11-09 22:40:09,010 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2014-11-09 22:40:09,011 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/hadoop/hbase 2>/dev/null'] {'not_if': 'test $(id -u hbase) -gt 1000'}
2014-11-09 22:40:09,066 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/hadoop/hbase 2>/dev/null'] due to not_if
2014-11-09 22:40:09,066 - Directory['/etc/hadoop/conf.empty'] {'owner': 'root', 'group': 'root', 'recursive': True}
2014-11-09 22:40:09,067 - Link['/etc/hadoop/conf'] {'not_if': 'ls /etc/hadoop/conf', 'to': '/etc/hadoop/conf.empty'}
2014-11-09 22:40:09,125 - Skipping Link['/etc/hadoop/conf'] due to not_if
2014-11-09 22:40:09,202 - File['/etc/hadoop/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs'}
2014-11-09 22:40:09,305 - Execute['/bin/echo 0 > /selinux/enforce'] {'only_if': 'test -f /selinux/enforce'}
2014-11-09 22:40:09,456 - Directory['/var/log/hadoop'] {'owner': 'root', 'group': 'hadoop', 'mode': 0775, 'recursive': True}
2014-11-09 22:40:09,457 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True}
2014-11-09 22:40:09,457 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'recursive': True}
2014-11-09 22:40:09,468 - File['/etc/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2014-11-09 22:40:09,472 - File['/etc/hadoop/conf/health_check'] {'content': Template('health_check-v2.j2'), 'owner': 'hdfs'}
2014-11-09 22:40:09,472 - File['/etc/hadoop/conf/log4j.properties'] {'content': '...', 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2014-11-09 22:40:09,483 - File['/etc/hadoop/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
2014-11-09 22:40:09,485 - File['/etc/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2014-11-09 22:40:09,951 - Execute['kill `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1'] {'not_if': '! (ls /var/run/storm/nimbus.pid >/dev/null 2>&1 && ps `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1)'}
2014-11-09 22:40:10,291 - Execute['kill -9 `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1'] {'not_if': 'sleep 2; ! (ls /var/run/storm/nimbus.pid >/dev/null 2>&1 && ps `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1) || sleep 20; ! (ls /var/run/storm/nimbus.pid >/dev/null 2>&1 && ps `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1)', 'ignore_failures': True}
2014-11-09 22:40:12,721 - Skipping Execute['kill -9 `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1'] due to not_if
2014-11-09 22:40:12,722 - Execute['rm -f /var/run/storm/nimbus.pid'] {}
2014-11-09 22:40:12,784 - Directory['/var/log/storm'] {'owner': 'storm', 'group': 'hadoop', 'recursive': True, 'mode': 0775}
2014-11-09 22:40:12,785 - Directory['/var/run/storm'] {'owner': 'storm', 'group': 'hadoop', 'recursive': True}
2014-11-09 22:40:12,785 - Directory['/hadoop/storm'] {'owner': 'storm', 'group': 'hadoop', 'recursive': True}
2014-11-09 22:40:12,786 - Directory['/etc/storm/conf'] {'owner': 'storm', 'group': 'hadoop', 'recursive': True}
2014-11-09 22:40:12,814 - File['/etc/storm/conf/config.yaml'] {'owner': 'storm', 'content': Template('config.yaml.j2'), 'group': 'hadoop'}
2014-11-09 22:40:12,890 - File['/etc/storm/conf/storm.yaml'] {'owner': 'storm', 'content': Template('storm.yaml.j2'), 'group': 'hadoop'}
2014-11-09 22:40:12,914 - File['/etc/storm/conf/storm-env.sh'] {'content': InlineTemplate(...), 'owner': 'storm'}
2014-11-09 22:40:12,919 - Execute['env JAVA_HOME=/usr/jdk64/jdk1.7.0_67 PATH=$PATH:/usr/jdk64/jdk1.7.0_67/bin storm nimbus > /var/log/storm/nimbus.out 2>&1'] {'wait_for_finish': False, 'path': ['/usr/hdp/current/storm-client/bin'], 'user': 'storm', 'not_if': 'ls /var/run/storm/nimbus.pid >/dev/null 2>&1 && ps `cat /var/run/storm/nimbus.pid` >/dev/null 2>&1'}
2014-11-09 22:40:13,004 - Execute['/usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid'] {'logoutput': True, 'path': ['/usr/hdp/current/storm-client/bin'], 'tries': 6, 'user': 'storm', 'try_sleep': 10}
2014-11-09 22:40:17,201 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
2014-11-09 22:40:31,111 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
2014-11-09 22:40:45,816 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
2014-11-09 22:40:58,878 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
2014-11-09 22:41:10,785 - Retrying after 10 seconds. Reason: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.
2014-11-09 22:41:24,539 - Error while executing command 'restart':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 123, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 233, in restart
    self.start(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.1/services/STORM/package/scripts/nimbus.py", line 43, in start
    service("nimbus", action="start")
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.1/services/STORM/package/scripts/service.py", line 69, in service
    path=params.storm_bin_dir
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 241, in action_run
    raise ex
Fail: Execution of '/usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ && /usr/jdk64/jdk1.7.0_67/bin/jps -l | grep storm.daemon.nimbus$ | awk {'print $1'} > /var/run/storm/nimbus.pid' returned 1.


Diffs
-----

  ambari-server/src/main/resources/stacks/HDP/2.1/services/STORM/package/scripts/service.py 8064ab7
  ambari-server/src/test/python/stacks/2.1/STORM/test_storm_drpc_server.py 6c2f28d

Diff: https://reviews.apache.org/r/28754/diff/


Testing
-------

mvn clean test


Thanks,

Andrew Onischuk

--===============3858020756795284004==--