ambari-dev mailing list archives

From "Dmitry Lysnichenko (JIRA)" <>
Subject [jira] [Resolved] (AMBARI-6184) Incorrect value for started_count of Datanode component
Date Wed, 18 Jun 2014 16:14:25 GMT


Dmitry Lysnichenko resolved AMBARI-6184.

    Resolution: Fixed

Committed to trunk.

> Incorrect value for started_count of Datanode component
> -------------------------------------------------------
>                 Key: AMBARI-6184
>                 URL:
>             Project: Ambari
>          Issue Type: Bug
>          Components: agent
>    Affects Versions: 1.6.1
>            Reporter: Dmitry Lysnichenko
>            Assignee: Dmitry Lysnichenko
>             Fix For: 1.6.1
> *STR:* 
> # Install a 3-node cluster with the HDP 1.3 stack (HDFS, MapReduce, Nagios, Ganglia, ZooKeeper), with slave components on all 3 hosts.
> # Enable security without setting up Kerberos.
> # After the security wizard fails as expected, disable security.
> # After security has been disabled successfully, the following API call returns an incorrect started_count for Datanode: it reports 0, although Datanode is actually running on all hosts.
> {code}
> http://server:8080/api/v1/clusters/c1/components/?ServiceComponentInfo/,CLIENT)&fields=ServiceComponentInfo/service_name,ServiceComponentInfo/installed_count,ServiceComponentInfo/started_count,ServiceComponentInfo/total_count&minimal_response=true
> {code}
> Reason:
> When Kerberos is set up incorrectly, DN processes fail to start but leave behind a stale pid file owned by root. The next DN start command starts a DN process, but that process cannot overwrite the pid file, so the server considers the DN stopped. If DN is started once more, the command fails soon after start (because of the lock file in the data dir, which is held by the already running DN). The agent reports to the server that DN is not running, so the server displays information that is correct from its point of view.
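
The failure mode described under "Reason" can be reproduced in miniature. The sketch below is not Ambari agent code. It assumes a POSIX host and a hypothetical status check (status_from_pid_file) that trusts only the pid recorded in a pid file. A short-lived child process stands in for the DN that failed to start, and the current process stands in for the second DN that is running but could not overwrite the pid file. The file name datanode.pid is also only illustrative.

```python
import os
import subprocess
import sys
import tempfile


def pid_is_alive(pid):
    """Return True if a process with this pid exists (POSIX only).

    Signal 0 performs the existence check without delivering a signal.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def status_from_pid_file(pid_file):
    """Hypothetical status check that trusts only the pid in pid_file."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return "STOPPED"
    return "STARTED" if pid_is_alive(pid) else "STOPPED"


def demo():
    # Stand-in for the DN that failed to start: it exits immediately,
    # but its pid stays recorded in the pid file.
    failed = subprocess.Popen([sys.executable, "-c", "pass"])
    failed.wait()

    with tempfile.TemporaryDirectory() as tmp:
        pid_file = os.path.join(tmp, "datanode.pid")
        with open(pid_file, "w") as f:
            f.write(str(failed.pid))

        # Stand-in for the second DN: it is alive, but (by assumption)
        # it could not overwrite the stale pid file.
        running_pid = os.getpid()
        return status_from_pid_file(pid_file), pid_is_alive(running_pid)


if __name__ == "__main__":
    reported, actually_alive = demo()
    print("status from pid file:", reported)
    print("process actually alive:", actually_alive)
```

With a stale pid in the file, the check reports STOPPED even though a live process exists, which matches a started_count of 0 while Datanode is running.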

This message was sent by Atlassian JIRA
