ambari-user mailing list archives

From Dmitry Sen <d...@hortonworks.com>
Subject Re: metrics visible from host_component but not component
Date Tue, 10 Nov 2015 23:04:31 GMT
Check ambari-metrics-collector.log for any exceptions.

________________________________
From: Siddharth Wagle <swagle@hortonworks.com>
Sent: Tuesday, November 10, 2015 8:57 PM
To: user@ambari.apache.org
Subject: Re: metrics visible from host_component but not component


HBase is case-sensitive. Can you make sure the value of timelineAppId is all lowercase?



________________________________
From: Nathan Falk <nfalk@us.ibm.com>
Sent: Tuesday, November 10, 2015 10:24 AM
To: user@ambari.apache.org
Subject: Re: metrics visible from host_component but not component


Sid, Dmitry,

Thanks for the tips.

I had already been specifying the timelineAppid for the service component in metainfo.xml:

  <services>
    <service>
      <name>GPFS</name>
      <displayName>Spectrum Scale</displayName>
      <comment>High-performance, scalable storage manages yottabytes of unstructured
data (formerly known as General Parallel File System, or GPFS)</comment>
      <version>4.1.1</version>
      <components>

        <component>
          <name>GPFS_MASTER</name>
          <displayName>GPFS Master</displayName>
          <category>MASTER</category>
          <cardinality>1</cardinality>
          <timelineAppid>GPFS</timelineAppid>
          <commandScript>
            <script>scripts/master.py</script>
            <scriptType>PYTHON</scriptType>
            <timeout>600</timeout>
          </commandScript>

I am unable to query AMS by appId, though. Only by host.

[root@dn01-dat ~]# curl -X GET "http://dn01:6188/ws/v1/timeline/metrics?metricNames=gpfs.disk_used&hostname=dn01-dat.ibm.com"
{"metrics":[{"timestamp":1447177544905,"metricname":"gpfs.disk_used","appid":"gpfs_master","hostname":"dn01-dat.ibm.com","starttime":1447177544,"metrics":{"1447177544":1439744.0}},{"timestamp":1447175401845,"metricname":"gpfs.disk_used","appid":"gpfs","hostname":"dn01-dat.ibm.com","starttime":1447175401,"metrics":{"1447175401":1439744.0}},{"timestamp":1447175289376,"metricname":"gpfs.disk_used","appid":"gpfs","hostname":"dn01-dat.ibm.com","starttime":1447175289,"metrics":{"1447175289":1439744.0}},{"timestamp":1447171202994,"metricname":"gpfs.disk_used","appid":"gpfs","hostname":"dn01-dat.ibm.com","starttime":1447171202,"metrics":{"1447171202":1439744.0}}]}

[root@dn01-dat ~]# curl -X GET "http://dn01:6188/ws/v1/timeline/metrics?metricNames=gpfs.disk_used&appId=gpfs"
{"metrics":[]}

[root@dn01-dat ~]# curl -X GET "http://dn01:6188/ws/v1/timeline/metrics?metricNames=gpfs.disk_used&appId=gpfs_master"
{"metrics":[]}
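
One way to see which appIds the collector has actually stored for a metric is to group the host-scoped response by its appid field. The following is a minimal Python sketch, not part of Ambari or AMS: the helper name timestamps_by_appid is hypothetical, and the sample data is the first two entries of the host-scoped response above.

```python
import json
from collections import defaultdict

# First two entries of the host-scoped collector response quoted above.
SAMPLE = """{"metrics":[
 {"timestamp":1447177544905,"metricname":"gpfs.disk_used","appid":"gpfs_master",
  "hostname":"dn01-dat.ibm.com","starttime":1447177544,"metrics":{"1447177544":1439744.0}},
 {"timestamp":1447175401845,"metricname":"gpfs.disk_used","appid":"gpfs",
  "hostname":"dn01-dat.ibm.com","starttime":1447175401,"metrics":{"1447175401":1439744.0}}
]}"""


def timestamps_by_appid(response_text):
    """Hypothetical helper: map each appid to the sorted timestamps stored under it."""
    grouped = defaultdict(list)
    for entry in json.loads(response_text)["metrics"]:
        grouped[entry["appid"]].append(entry["timestamp"])
    return {appid: sorted(stamps) for appid, stamps in grouped.items()}


if __name__ == "__main__":
    for appid, stamps in timestamps_by_appid(SAMPLE).items():
        print(appid, stamps)
```

On the sample this lists both gpfs and gpfs_master, which matches the response above: the same metric was posted under two different appIds at different times.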


I have also experimented with setting the timeline.metrics.service.cluster.aggregator.appIds
property in ams-site to include the "gpfs" appId (followed by a restart of AMS services).
That did not seem to change anything.

Thanks,

Nate Falk
nfalk@us.ibm.com


From: Siddharth Wagle <swagle@hortonworks.com>
To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: 11/10/2015 01:06 PM
Subject: Re: metrics visible from host_component but not component

________________________________



Hi,

There is a field called timelineAppId, example:
common-services/ACCUMULO/1.6.1.2.2.0/metainfo.xml

This overrides the appId that Ambari uses in its call to AMS. You can set it to gpfs.

- Sid

________________________________

From: Dmitry Sen <dsen@hortonworks.com>
Sent: Tuesday, November 10, 2015 9:21 AM
To: user@ambari.apache.org
Subject: Re: metrics visible from host_component but not component


Hi,

AMS doesn't support metrics for custom services yet, but here are a few things to check.

Try to call
curl -X GET -u admin:admin "http://dn01:6188/ws/v1/timeline/metrics?metricNames=gpfs.disk_used&appId=gpfs"

If you don't get the metrics back, then something needs to be fixed on the AMS side. Otherwise:

When you do
curl -X GET -u admin:admin "http://dn01:8080/api/v1/clusters/nate/services/GPFS/components/GPFS_MASTER?fields=metrics/gpfs/disk_used"

Ambari calls
curl -X GET -u admin:admin "http://dn01:6188/ws/v1/timeline/metrics?metricNames=gpfs.disk_used&appId=gpfs_master"

But as I can see in your previous message, appId is "gpfs".
{"metrics":[{"timestamp":1447084964323,"metricname":"gpfs.disk_used","appid":"gpfs","hostname":"dn01-dat.ibm.com","starttime":1447084964,"metrics":{"1447084964":1437696.0}}]}

Your custom application should report its appId as gpfs_master, not gpfs. Another option is
to rename the GPFS_MASTER component to GPFS in metainfo.xml.
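
The naming rule described in this thread can be sketched from the reporting side in Python. This is an illustration under stated assumptions, not Ambari's implementation: the function names expected_app_id and collector_query_url are hypothetical, the rule "lowercased component name unless timelineAppid is set" is inferred from the messages here, and how Ambari itself treats an uppercase override is not established in this thread.

```python
from urllib.parse import urlencode


def expected_app_id(component_name, timeline_app_id=None):
    """Hypothetical helper: the appId a metric sender should report.

    Assumption from this thread: Ambari queries AMS with the lowercased
    component name (GPFS_MASTER -> gpfs_master) unless metainfo.xml sets
    timelineAppid, and the stored appId should be all lowercase.
    """
    chosen = timeline_app_id if timeline_app_id else component_name
    return chosen.lower()


def collector_query_url(base_url, metric_name, app_id):
    """Build the collector query used in this thread for one metric and appId."""
    query = urlencode({"metricNames": metric_name, "appId": app_id})
    return f"{base_url}/ws/v1/timeline/metrics?{query}"


if __name__ == "__main__":
    app_id = expected_app_id("GPFS_MASTER")
    print(collector_query_url("http://dn01:6188", "gpfs.disk_used", app_id))
```

With no override this produces the appId=gpfs_master query shown above; passing timeline_app_id="GPFS" would produce appId=gpfs instead.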

BR,
Dmytro Sen

________________________________

From: Nathan Falk <nfalk@us.ibm.com>
Sent: Tuesday, November 10, 2015 6:37 PM
To: user@ambari.apache.org
Subject: metrics visible from host_component but not component

I have a custom Ambari service, with a metrics.json and widgets.json defined.

The widgets display on the service dashboard summary page, but instead of the graph or data,
I see "n/a".

When I use the REST API to query the ambari server, I see the metrics for the host_component,
but not when I query the component.

In metrics.json, I've added some of the basic ams host metrics, plus some service-specific
metrics. All metrics are defined in both "Component" and "HostComponent". As an example:

{
 "GPFS_MASTER": {
   "Component": [
     {
       "type": "ganglia",
       "metrics": {
         "default": {
           "metrics/cpu/cpu_idle":{
             "metric":"cpu_idle",
             "pointInTime":true,
             "temporal":true,
             "amsHostMetric":true
           },
           ...
           "metrics/gpfs/disk_used": {
             "metric": "gpfs.disk_used",
             "pointInTime": true,
             "temporal": true
           },
           ...
         }
       }
     }
   ],
   "HostComponent": [
     {
       "type": "ganglia",
       "metrics": {
         "default": {
           "metrics/cpu/cpu_idle":{
             "metric":"cpu_idle",
             "pointInTime":true,
             "temporal":true,
             "amsHostMetric":true
           },
           ...
           "metrics/gpfs/disk_used": {
             "metric": "gpfs.disk_used",
             "pointInTime": true,
             "temporal": true
           },
           ...


I query the AMS Collector, and it seems that the metrics are there:
[root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:6188/ws/v1/timeline/metrics?metricNames=gpfs.disk_used&hostname=dn01-dat.ibm.com"
{"metrics":[{"timestamp":1447084964323,"metricname":"gpfs.disk_used","appid":"gpfs","hostname":"dn01-dat.ibm.com","starttime":1447084964,"metrics":{"1447084964":1437696.0}}]}

I query Ambari, and whether I see the metric or not depends on how I do the query. If I query
the GPFS_MASTER service component, I do NOT see the metric:
[root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:8080/api/v1/clusters/nate/services/GPFS/components/GPFS_MASTER?fields=metrics/gpfs/disk_used"
{
 "href" : "http://dn01:8080/api/v1/clusters/nate/services/GPFS/components/GPFS_MASTER?fields=metrics/gpfs/disk_used",
 "ServiceComponentInfo" : {
   "cluster_name" : "nate",
   "component_name" : "GPFS_MASTER",
   "service_name" : "GPFS"
 }
}

If I query the GPFS_MASTER host component on dn01, then I do see the metric:
[root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com/host_components/GPFS_MASTER?fields=metrics/gpfs/disk_used"
{
 "href" : "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com/host_components/GPFS_MASTER?fields=metrics/gpfs/disk_used",
 "HostRoles" : {
   "cluster_name" : "nate",
   "component_name" : "GPFS_MASTER",
   "host_name" : "dn01-dat.ibm.com"
 },
 "host" : {
   "href" : "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com"
 },
 "metrics" : {
   "gpfs" : {
     "disk_used" : 1437696.0
   }
 }
}
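
The difference between the two responses above can be checked programmatically by walking the same path that was passed in the fields parameter. The following is a minimal Python sketch: get_metric is a hypothetical helper, and the two dictionaries are trimmed copies of the component and host_component responses above.

```python
def get_metric(response, field_path):
    """Hypothetical helper: follow a path such as 'metrics/gpfs/disk_used'
    through a parsed Ambari response; return None if any level is missing."""
    node = response
    for key in field_path.split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Trimmed copies of the two responses quoted above.
COMPONENT = {
    "ServiceComponentInfo": {
        "cluster_name": "nate",
        "component_name": "GPFS_MASTER",
        "service_name": "GPFS",
    }
}
HOST_COMPONENT = {
    "HostRoles": {
        "cluster_name": "nate",
        "component_name": "GPFS_MASTER",
        "host_name": "dn01-dat.ibm.com",
    },
    "metrics": {"gpfs": {"disk_used": 1437696.0}},
}

if __name__ == "__main__":
    path = "metrics/gpfs/disk_used"
    print("component:", get_metric(COMPONENT, path))
    print("host_component:", get_metric(HOST_COMPONENT, path))
```

Returning None rather than raising keeps the check usable in a loop over several metric paths, where a missing block is the condition being looked for.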

By comparison, if I query the "cpu_idle" metric, also defined in the GPFS metrics.json file,
I see the metric in both queries:
[root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:8080/api/v1/clusters/nate/services/GPFS/components/GPFS_MASTER?fields=metrics/cpu/cpu_idle"
{
 "href" : "http://dn01:8080/api/v1/clusters/nate/services/GPFS/components/GPFS_MASTER?fields=metrics/cpu/cpu_idle",
 "ServiceComponentInfo" : {
   "cluster_name" : "nate",
   "component_name" : "GPFS_MASTER",
   "service_name" : "GPFS"
 },
 "metrics" : {
   "cpu" : {
     "cpu_idle" : 0.6248046875
   }
 }
}
[root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com/host_components/GPFS_MASTER?fields=metrics/cpu/cpu_idle"
{
 "href" : "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com/host_components/GPFS_MASTER?fields=metrics/cpu/cpu_idle",
 "HostRoles" : {
   "cluster_name" : "nate",
   "component_name" : "GPFS_MASTER",
   "host_name" : "dn01-dat.ibm.com"
 },
 "host" : {
   "href" : "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com"
 },
 "metrics" : {
   "cpu" : {
     "cpu_idle" : 0.624375
   }
 }
}

I feel like getting back "n/a" on the widgets is related to not seeing the metrics when I
query the component rather than the host_component, but I'm not 100% sure about that either.

My problems don't seem to end there, either. When I create new widgets using the gpfs metrics,
I start seeing some wildly inconsistent behavior. Sometimes I'll get the right metric data,
sometimes as I add and remove widgets they'll go back to displaying n/a or even displaying
old values for the metric data.

I must be missing something really simple, but I think I'm going to need help to figure out
what that might be.

Does anyone out there have any suggestions for how to investigate this further or what I might
be missing with regard to defining or posting these metrics?

Thanks,

Nate Falk
nfalk@us.ibm.com

