hadoop-common-issues mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7630) hadoop-metrics2.properties should have a property *.period set to a default value for metrics
Date Tue, 20 Sep 2011 16:10:12 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108802#comment-13108802
] 

Eric Yang commented on HADOOP-7630:
-----------------------------------

bq. Are you saying simon aggregator could not process less than 1k udp packets per second?

No, that is not what I was saying.  On all production clusters, the status page shows 93%
packet loss for disk metrics.  Disk metrics are emitted per disk.  On a typical 2000-node
cluster, there were 4 disks per node, which works out to 8k metrics per 5 seconds.  A single
simon aggregator has trouble handling the aggregation load at this scale.  Hadoop metrics
volume is presumably smaller than system metrics, but multiplied by the metric types (jvm,
rpc, mapred, hdfs), the number of output udp packets would reach the same scale as disk
metrics if nothing is done to reduce the repeated noise.
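The arithmetic behind that claim can be sketched quickly (figures are the ones quoted in this thread; the 5-second emission interval is assumed from the numbers above):

```python
# Back-of-envelope packet-rate estimate for the disk-metrics case above.
nodes = 2000
disks_per_node = 4
interval_s = 5  # assumed emission period implied by "8k metrics per 5 seconds"

metrics_per_interval = nodes * disks_per_node          # one metric per disk per interval
packets_per_second = metrics_per_interval / interval_s

print(metrics_per_interval)   # 8000
print(packets_per_second)     # 1600.0 -- already well past 1k udp packets/sec
```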

bq. I'm sure you meant simon aggregator.

No, I mean the simon plugin; we want gauge-like metrics to be in sync at the source (MetricsContext)
as well as in the plugins.  Internally, the simon aggregator will use the last known value,
or calculate the missing gap, if packets are lost.  I wrote the code that handles missing
udp packets in the Simon aggregator per management's request.
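The last-known-value behavior described above can be sketched as follows (a hypothetical illustration, not the actual Simon aggregator code; the class and sample format are invented for this example):

```python
# Hypothetical sketch of last-known-value gap filling for a gauge
# when udp packets are dropped. A None sample models a lost packet.
from typing import Optional


class GaugeSeries:
    """Keeps the last known gauge value and reuses it for missing samples."""

    def __init__(self) -> None:
        self.last_value: Optional[float] = None

    def record(self, value: Optional[float]) -> Optional[float]:
        # On a lost packet (None), fall back to the last known value.
        if value is not None:
            self.last_value = value
        return self.last_value


series = GaugeSeries()
samples = [10.0, None, None, 13.0]   # two dropped packets in the middle
filled = [series.record(s) for s in samples]
print(filled)   # [10.0, 10.0, 10.0, 13.0]
```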

bq. My point is that you should not change the current default that has potential impact on
production monitoring without actually testing it at scale.

This configuration has been verified to work at 40-node scale.  I am confident that it would
not cause any harm, and it reduces the potential breaking point.
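For reference, the change under discussion amounts to a one-line default in hadoop-metrics2.properties (the sink line shown is illustrative, not part of the patch):

```properties
# Default refresh period, in seconds, applied to all sinks (*)
*.period=60

# Example sink entry that would pick up the default above
# namenode.sink.file.filename=namenode-metrics.out
```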

> hadoop-metrics2.properties should have a property *.period set to a default value for metrics
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7630
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7630
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>            Reporter: Arpit Gupta
>            Assignee: Eric Yang
>             Fix For: 0.20.205.0, 0.23.0
>
>         Attachments: HADOOP-7630-trunk.patch, HADOOP-7630.patch
>
>
> Currently the hadoop-metrics2.properties file does not have a value set for *.period.
> This property determines how often metrics are refreshed. We should set it to a default
> of 60.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
