ambari-dev mailing list archives

From "Alejandro Fernandez (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-9717) Kafka & Spark service checks fail intermittently on kerberized cluster
Date Fri, 20 Feb 2015 01:16:11 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-9717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Fernandez updated AMBARI-9717:
----------------------------------------
    Attachment: AMBARI-9717.patch

> Kafka & Spark service checks fail intermittently on kerberized cluster
> ----------------------------------------------------------------------
>
>                 Key: AMBARI-9717
>                 URL: https://issues.apache.org/jira/browse/AMBARI-9717
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.0.0
>            Reporter: Alejandro Fernandez
>            Assignee: Alejandro Fernandez
>             Fix For: 2.0.0
>
>         Attachments: AMBARI-9717.patch
>
>
> Impact: Prevents RU from completing successfully
> Frequency: reproduces often
> I ran into this while performing a Rolling Upgrade (RU), with the following steps:
> * Installed a 3-node cluster with ambari build #427
> * Installed HDP 2.2.2.0-2398 on centos 6
> * Added HDFS and ZK
> * Added Namenode HA
> * Added all services (including Spark and Ranger)
> * Kerberized the cluster (failed to start due to AMS service check)
> * Registered repo HDP 2.2.2.0-2399
> * Performed an RU
> stdout:
> {code}
> Running kafka create topic command
> 2015-02-18 03:29:51,851 - u'Execute[\'source /etc/kafka/conf/kafka-env.sh ; /usr/hdp/current/kafka-broker//bin/kafka-topics.sh --zookeeper c6403.ambari.apache.org:2181,c6401.ambari.apache.org:2181,c6402.ambari.apache.org:2181 --create --topic ambari_kafka_service_check --partitions 1 --replication-factor 1 | grep \'Created topic "ambari_kafka_service_check".\\|Topic "ambari_kafka_service_check" already exists.\'\']' {'logoutput': True}
> 2015-02-18 03:29:54,183 - Error while executing command 'service_check':
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 208, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/KAFKA/0.8.1.2.2/package/scripts/service_check.py", line 37, in service_check
>     logoutput=True,
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 276, in action_run
>     raise ex
> Fail: Execution of 'source /etc/kafka/conf/kafka-env.sh ; /usr/hdp/current/kafka-broker//bin/kafka-topics.sh --zookeeper c6403.ambari.apache.org:2181,c6401.ambari.apache.org:2181,c6402.ambari.apache.org:2181 --create --topic ambari_kafka_service_check --partitions 1 --replication-factor 1 | grep 'Created topic "ambari_kafka_service_check".\|Topic "ambari_kafka_service_check" already exists.'' returned 1.
> {code}
> It turns out that the Kafka topic command can return a nonzero exit code in cases that are still valid, so the output just needs to be validated against a regular expression.
> For Spark, the service check was not running kinit before it executed.
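
The output validation described above could look roughly like the sketch below. This is not the contents of AMBARI-9717.patch; it only illustrates the idea under a few assumptions: the two expected messages are copied from the grep pattern in the log, the helper names (output_indicates_success, run_and_validate) are hypothetical, and it uses Python 3's subprocess.run rather than the Ambari resource_management library shown in the traceback.

```python
import re
import subprocess

# The two messages come from the grep pattern in the service check log.
# Dots are escaped here so they match literally.
SUCCESS_PATTERN = re.compile(
    r'Created topic "ambari_kafka_service_check"\.'
    r'|Topic "ambari_kafka_service_check" already exists\.'
)


def output_indicates_success(output):
    """Return True if the output contains either expected message."""
    return SUCCESS_PATTERN.search(output) is not None


def run_and_validate(command):
    """Run a command, ignore its exit code, and judge it by its output.

    Hypothetical helper: `command` is a list of arguments. stderr is
    merged into stdout so a message printed on either stream is seen.
    """
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return output_indicates_success(proc.stdout)
```

With this shape, a run that exits nonzero but reports that the topic already exists is still treated as a passing check, while output containing neither message is treated as a failure.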



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
