ambari-dev mailing list archives

From "Hadoop QA (JIRA)" <>
Subject [jira] [Commented] (AMBARI-11821) With HBase master HA Ambari sometimes displays incorrect dashboard information
Date Wed, 10 Jun 2015 04:15:00 GMT


Hadoop QA commented on AMBARI-11821:

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified
test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in ambari-server.

Test results:
Console output:

This message is automatically generated.

> With HBase master HA Ambari sometimes displays incorrect dashboard information
> ------------------------------------------------------------------------------
>                 Key: AMBARI-11821
>                 URL:
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-web
>    Affects Versions: 2.1.0
>            Reporter: Jaimin D Jetly
>            Assignee: Jaimin D Jetly
>            Priority: Critical
>             Fix For: 2.1.0
>         Attachments: AMBARI-11821.patch, AMBARI-11821_2.patch
> PROBLEM:  When there is more than one HBase Master, certain metrics, in particular the
> "Average Load" on the dashboard, are incorrect.  In the case of "Average Load", the load
> reads "0".  After checking, I noticed that the web UI is hitting the following URL to
> refresh the metrics:
> {code}
> http://ADDRESS_OF_AMBARI_SERVER:8080/api/v1/clusters/CLUSTER_NAME/components/?ServiceComponentInfo/component_name=APP_TIMELINE_SERVER|ServiceComponentInfo/category=MASTER&fields=ServiceComponentInfo/Version,ServiceComponentInfo/StartTime,ServiceComponentInfo/HeapMemoryUsed,ServiceComponentInfo/HeapMemoryMax,ServiceComponentInfo/service_name,host_components/HostRoles/host_name,host_components/HostRoles/state,host_components/HostRoles/maintenance_state,host_components/HostRoles/stale_configs,host_components/HostRoles/ha_state,host_components/HostRoles/desired_admin_state,host_components/metrics/jvm/memHeapUsedM,host_components/metrics/jvm/HeapMemoryMax,host_components/metrics/jvm/HeapMemoryUsed,host_components/metrics/jvm/memHeapCommittedM,host_components/metrics/mapred/jobtracker/trackers_decommissioned,host_components/metrics/cpu/cpu_wio,host_components/metrics/rpc/RpcQueueTime_avg_time,host_components/metrics/dfs/FSNamesystem/*,host_components/metrics/dfs/namenode/Version,host_components/metrics/dfs/namenode/LiveNodes,host_components/metrics/dfs/namenode/DeadNodes,host_components/metrics/dfs/namenode/DecomNodes,host_components/metrics/dfs/namenode/TotalFiles,host_components/metrics/dfs/namenode/UpgradeFinalized,host_components/metrics/dfs/namenode/Safemode,host_components/metrics/runtime/StartTime,host_components/metrics/hbase/master/IsActiveMaster,ServiceComponentInfo/MasterStartTime,ServiceComponentInfo/MasterActiveTime,ServiceComponentInfo/AverageLoad,ServiceComponentInfo/Revision,ServiceComponentInfo/RegionsInTransition,metrics/api/v1/cluster/summary,metrics/api/v1/topology/summary,host_components/metrics/yarn/Queue,ServiceComponentInfo/rm_metrics/cluster/activeNMcount,ServiceComponentInfo/rm_metrics/cluster/lostNMcount,ServiceComponentInfo/rm_metrics/cluster/unhealthyNMcount,ServiceComponentInfo/rm_metrics/cluster/rebootedNMcount,ServiceComponentInfo/rm_metrics/cluster/decommissionedNMcount&minimal_response=true
> {code}
> The results that come back sometimes seem to be for the wrong server.  Specifically, this:
> {code}
>       "ServiceComponentInfo" : {
>         "AverageLoad" : 0.0,
>         "HeapMemoryMax" : 2075918336,
>         "HeapMemoryUsed" : 541616216,
>         "MasterActiveTime" : 0,
>         "MasterStartTime" : 1432752607527,
>         "component_name" : "HBASE_MASTER",
>         "service_name" : "HBASE"
>       },
> {code}
> I'm attaching a file with the output of two different clusters at the customer site.
> In both cases, the average load was not 0, but it shows up that way in the JSON.  Also
> notice that IsActiveMaster is false for one of the clusters and true for the other.  It
> seems like there is a disconnect in what comes back from that query URL.
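
The symptom described above can be checked mechanically. The following is a minimal sketch, not Ambari code: it parses the `ServiceComponentInfo` fragment quoted in the report and flags an HBASE_MASTER entry whose `AverageLoad` and `MasterActiveTime` are both zero. Treating that combination as "probably metrics from a non-active master" is an assumption drawn from the report, and the function name `looks_like_standby_metrics` is hypothetical.

```python
import json

# Fragment quoted in the report, wrapped in braces so it parses as JSON.
SAMPLE = """
{
  "ServiceComponentInfo": {
    "AverageLoad": 0.0,
    "HeapMemoryMax": 2075918336,
    "HeapMemoryUsed": 541616216,
    "MasterActiveTime": 0,
    "MasterStartTime": 1432752607527,
    "component_name": "HBASE_MASTER",
    "service_name": "HBASE"
  }
}
"""


def looks_like_standby_metrics(component):
    """Heuristic (an assumption, not Ambari logic): an HBASE_MASTER entry
    with zero AverageLoad and zero MasterActiveTime was probably filled in
    from a master that is not currently active."""
    info = component.get("ServiceComponentInfo", {})
    if info.get("component_name") != "HBASE_MASTER":
        return False
    return info.get("AverageLoad") == 0 and info.get("MasterActiveTime") == 0


component = json.loads(SAMPLE)
suspect = looks_like_standby_metrics(component)
print("suspect response:", suspect)
```

A zero average load is legitimate on an idle cluster, so a flagged response is only a hint to compare against the active master's own metrics.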
> I reproduced this on a local cluster as follows (HDP 2.2.4 with Ambari 2.0.0):
> 1.  Started the HBase service with 2 masters.  I observed that Average Load was displaying
> the right (non-zero) value.
> 2.  I then restarted the current active master.  This shifted active to the other one.
> After a minute or so, the Average Load went to 0.
> 3.  In one instance, for some reason, the problem did not happen right away.  I bounced
> ambari-server and then the problem happened.  I could restore the reading by shifting
> active back to the other master.
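
When repeating the reproduction steps above, it can help to watch a trimmed-down version of the dashboard query that requests only the HBase master fields. The sketch below only builds such a URL. The field names are copied from the URL quoted in the report; the `component_name=HBASE_MASTER` predicate mirrors the predicate form used there and is an assumption, as are the helper name `hbase_master_query` and the example host and cluster names.

```python
# Fields copied from the dashboard URL quoted in the report.
HBASE_FIELDS = [
    "ServiceComponentInfo/MasterStartTime",
    "ServiceComponentInfo/MasterActiveTime",
    "ServiceComponentInfo/AverageLoad",
    "host_components/HostRoles/host_name",
    "host_components/metrics/hbase/master/IsActiveMaster",
]


def hbase_master_query(server, cluster, port=8080):
    """Build a reduced components query for the HBase master only.

    The predicate form is assumed to work like the one in the report's URL.
    """
    base = f"http://{server}:{port}/api/v1/clusters/{cluster}/components/"
    predicate = "ServiceComponentInfo/component_name=HBASE_MASTER"
    return f"{base}?{predicate}&fields={','.join(HBASE_FIELDS)}"


# Hypothetical host and cluster names, for illustration only.
url = hbase_master_query("ambari.example.com", "c1")
print(url)
```

Fetching this URL before and after step 2 and comparing `AverageLoad` with each host's `IsActiveMaster` value should show whether the aggregate was taken from the standby master.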

This message was sent by Atlassian JIRA