ambari-dev mailing list archives

From "Alejandro Fernandez (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMBARI-9717) Kafka & Spark service checks fail intermittently on kerberized cluster
Date Fri, 20 Feb 2015 01:12:11 GMT
Alejandro Fernandez created AMBARI-9717:
-------------------------------------------

             Summary: Kafka & Spark service checks fail intermittently on kerberized cluster
                 Key: AMBARI-9717
                 URL: https://issues.apache.org/jira/browse/AMBARI-9717
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.0.0
            Reporter: Alejandro Fernandez
            Assignee: Alejandro Fernandez
             Fix For: 2.0.0


Impact: Prevents RU from completing successfully
Frequency: reproduces often

I ran into this while performing an RU, after the following steps:
* Installed a 3-node cluster with ambari build #427
* Installed HDP 2.2.2.0-2398 on centos 6
* Added HDFS and ZK
* Added Namenode HA
* Added all services (including Spark and Ranger)
* Kerberized the cluster (failed to start due to AMS service check)
* Registered repo HDP 2.2.2.0-2399
* Performed an RU

stdout:
{code}
Running kafka create topic command
2015-02-18 03:29:51,851 - u'Execute[\'source /etc/kafka/conf/kafka-env.sh ; /usr/hdp/current/kafka-broker//bin/kafka-topics.sh --zookeeper c6403.ambari.apache.org:2181,c6401.ambari.apache.org:2181,c6402.ambari.apache.org:2181 --create --topic ambari_kafka_service_check --partitions 1 --replication-factor 1 | grep \'Created topic "ambari_kafka_service_check".\\|Topic "ambari_kafka_service_check" already exists.\'\']' {'logoutput': True}
2015-02-18 03:29:54,183 - Error while executing command 'service_check':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 208, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/KAFKA/0.8.1.2.2/package/scripts/service_check.py", line 37, in service_check
    logoutput=True,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 276, in action_run
    raise ex
Fail: Execution of 'source /etc/kafka/conf/kafka-env.sh ; /usr/hdp/current/kafka-broker//bin/kafka-topics.sh --zookeeper c6403.ambari.apache.org:2181,c6401.ambari.apache.org:2181,c6402.ambari.apache.org:2181 --create --topic ambari_kafka_service_check --partitions 1 --replication-factor 1 | grep 'Created topic "ambari_kafka_service_check".\|Topic "ambari_kafka_service_check" already exists.'' returned 1.
{code}

It turns out that the Kafka topic command can return a nonzero exit code in cases that are still valid,
so the service check should validate the command's output against a regular expression instead of relying on the exit code.
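
As a rough illustration of that idea (not the actual Ambari patch), the check could capture the command output and match it against the two messages the grep above looks for, ignoring the exit code. The names EXPECTED, output_is_valid and run_check are hypothetical and exist only for this sketch.

```python
import re
import subprocess

# Assumption: these are the two messages that count as success, taken from
# the grep pattern in the failing command above.
EXPECTED = re.compile(
    r'Created topic "ambari_kafka_service_check"\.'
    r'|Topic "ambari_kafka_service_check" already exists\.'
)


def output_is_valid(output):
    """Return True if the output contains one of the expected messages."""
    return EXPECTED.search(output) is not None


def run_check(command):
    """Run a shell command and judge it by its output, not its exit code."""
    proc = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    out, _ = proc.communicate()
    # proc.returncode is deliberately ignored here.
    return output_is_valid(out.decode("utf-8", "replace"))
```

With this shape, a run that prints "Topic ... already exists." and exits nonzero would still pass, while a run that prints neither message would fail.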

For Spark, the service check was not running kinit before executing its commands.
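
A minimal sketch of what obtaining a ticket first could look like, assuming the check is a shell command string and that a keytab and principal are available for the user running it. The function name with_kinit and all of its parameters are hypothetical, not the actual Ambari code.

```python
def with_kinit(command, security_enabled, kinit_path, keytab, principal):
    """Prefix a shell command with kinit when Kerberos is enabled.

    All arguments are illustrative; real values would come from the
    cluster configuration.
    """
    if not security_enabled:
        return command
    # "kinit -kt <keytab> <principal>" obtains a ticket from a keytab.
    kinit_cmd = "{0} -kt {1} {2}".format(kinit_path, keytab, principal)
    # "&&" skips the check when kinit itself fails.
    return "{0} && {1}".format(kinit_cmd, command)
```

On a non-kerberized cluster the command is returned unchanged, so the same code path can serve both cases.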



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
