ambari-dev mailing list archives

From "Dmitry Lysnichenko (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMBARI-6184) Incorrect value for started_count of Datanode component
Date Wed, 18 Jun 2014 14:04:54 GMT
Dmitry Lysnichenko created AMBARI-6184:
------------------------------------------

             Summary: Incorrect value for started_count of Datanode component
                 Key: AMBARI-6184
                 URL: https://issues.apache.org/jira/browse/AMBARI-6184
             Project: Ambari
          Issue Type: Bug
          Components: agent
    Affects Versions: 1.6.1
            Reporter: Dmitry Lysnichenko
            Assignee: Dmitry Lysnichenko
             Fix For: 1.6.1


*STR:* 
# Install a 3-node cluster on the HDP 1.3 stack with HDFS, MapReduce, Nagios, Ganglia and ZooKeeper,
with slave components installed on all 3 hosts.
# Enable security without setting up Kerberos.
# When the security wizard fails as expected, disable security.
# After security has been disabled successfully, the following API call returns an incorrect
started_count for the Datanode component. It reports 0, but Datanode is actually running on all hosts.
{code}
http://server:8080/api/v1/clusters/c1/components/?ServiceComponentInfo/category.in(SLAVE,CLIENT)&fields=ServiceComponentInfo/service_name,ServiceComponentInfo/installed_count,ServiceComponentInfo/started_count,ServiceComponentInfo/total_count&minimal_response=true
{code}
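
To make the check above easier to repeat, here is a minimal Python sketch that rebuilds the same request URL and pulls started_count out of a response body. It is an illustration only: the helper names (build_components_url, started_counts) are hypothetical, and the response layout (an "items" list whose entries carry a "ServiceComponentInfo" object with a "component_name" key) is an assumption about the API output, not something shown in this report.

```python
import json

# Fields requested in the URL from the report.
FIELDS = ("service_name", "installed_count", "started_count", "total_count")


def build_components_url(server, cluster, categories=("SLAVE", "CLIENT")):
    """Rebuild the components query used in the report (hypothetical helper)."""
    prefix = "ServiceComponentInfo/"
    predicate = prefix + "category.in(" + ",".join(categories) + ")"
    fields = ",".join(prefix + name for name in FIELDS)
    return (
        "http://" + server + "/api/v1/clusters/" + cluster + "/components/?"
        + predicate + "&fields=" + fields + "&minimal_response=true"
    )


def started_counts(response_text):
    """Map component name -> started_count.

    Assumes the body looks like
    {"items": [{"ServiceComponentInfo": {"component_name": ..., "started_count": ...}}]}.
    """
    payload = json.loads(response_text)
    counts = {}
    for item in payload.get("items", []):
        info = item.get("ServiceComponentInfo", {})
        name = info.get("component_name")
        if name is not None:
            counts[name] = info.get("started_count")
    return counts
```

With a body shaped as assumed, started_counts(body).get("DATANODE") would return 0 in the situation described here, even though the Datanode processes are up.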

Reason:
During the incorrect Kerberos setup, DN processes fail to start but leave behind a stale pid file
owned by root. The next DN start command starts a DN process, but it cannot overwrite the pid file, so
the server considers the DN stopped. If we start the DN once more, the command fails soon after start
(because the lock file in the data dir is held by the already running DN). The agent reports to the
server that the DN is not running, so the server displays information that is correct from its point of view.



--
This message was sent by Atlassian JIRA
(v6.2#6252)