hadoop-common-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8541) Better high-percentile latency metrics
Date Tue, 10 Jul 2012 18:42:35 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410690#comment-13410690 ]

Andrew Wang commented on HADOOP-8541:
-------------------------------------

Thanks for the very detailed review, Aaron. The Eclipse auto-formatter should have taken care
of the style comments, and I tried to address most of the rest.

For 11, since memory usage seemed to be a major concern, I wanted to avoid the per-object
overhead of boxed Longs by using the primitive type. The savings are only on the order of
kilobytes though, so I can change it if you think the code isn't readable. The usage of the
array is pretty simple: there isn't any weird iteration, and it's only ever flushed completely.
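
To illustrate the trade-off for 11, here is a minimal sketch of a buffer backed by a primitive
long[] that is only ever appended to and then flushed in one step. The class name
LongSampleBuffer and its methods are hypothetical and are not taken from the attached patches;
a List<Long> would work the same way but allocates one boxed object per sample.

```java
import java.util.Arrays;

/** Hypothetical sample buffer; illustrative only, not the class from the patch. */
final class LongSampleBuffer {
    private final long[] values; // primitive storage: 8 bytes per sample, no boxing
    private int count = 0;

    LongSampleBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.values = new long[capacity];
    }

    /** Appends a sample; returns true once the buffer is full and should be flushed. */
    boolean add(long value) {
        if (count == values.length) {
            throw new IllegalStateException("buffer full; call flush() first");
        }
        values[count++] = value;
        return count == values.length;
    }

    /** Returns the buffered samples in sorted order and empties the buffer. */
    long[] flush() {
        long[] out = Arrays.copyOf(values, count);
        Arrays.sort(out);
        count = 0; // the buffer is never partially drained
        return out;
    }

    int size() {
        return count;
    }
}
```

Because the only operations are append and full flush, the array never needs index
bookkeeping beyond a single counter.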

For 13, I'd have to think about how to make this work. It's probably doable (basically need
to merge adjacent items instead of inserting), but I don't think it'll yield that big of a
performance boost. Again, I'll work on this if you think it's worthwhile.
                
> Better high-percentile latency metrics
> --------------------------------------
>
>                 Key: HADOOP-8541
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8541
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 2.0.0-alpha
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hadoop-8541-1.patch, hadoop-8541-2.patch, hadoop-8541-3.patch, hadoop-8541-4.patch
>
>
> Based on discussion in HBASE-6261 and with some HDFS devs, I'd like to make better high-percentile
> latency metrics a part of hadoop-common.
> I've already got a working implementation of [1], an efficient algorithm for estimating
> quantiles on a stream of values. It allows you to specify arbitrary quantiles to track (e.g.
> 50th, 75th, 90th, 95th, 99th), along with tight error bounds. This estimator can be snapshotted
> and reset periodically to get a feel for how these percentiles are changing over time.
> I propose creating a new MutableQuantiles class that does this. [1] isn't completely
> without overhead (~1MB memory for reasonably sized windows), which is why I hesitate to add
> it to the existing MutableStat class.
> [1] Cormode, Korn, Muthukrishnan, and Srivastava. "Effective Computation of Biased Quantiles
> over Data Streams" in ICDE 2005.
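
The snapshot-and-reset usage described in the issue can be sketched as follows. This is an
illustration under stated assumptions, not the proposed MutableQuantiles class: the name
WindowedQuantiles and its methods are hypothetical, and instead of the streaming algorithm
from [1] it simply stores every sample in the window and computes exact nearest-rank
quantiles, so its memory use grows with the window size.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical windowed quantile tracker; exact, NOT the estimator from [1]. */
final class WindowedQuantiles {
    private final double[] quantiles;
    private long[] samples = new long[16];
    private int count = 0;

    /** @param quantiles the quantiles to report, each in (0, 1], e.g. 0.5, 0.99 */
    WindowedQuantiles(double... quantiles) {
        for (double q : quantiles) {
            if (q <= 0.0 || q > 1.0) {
                throw new IllegalArgumentException("quantile out of range: " + q);
            }
        }
        this.quantiles = quantiles.clone();
    }

    /** Records one latency sample in the current window. */
    synchronized void add(long value) {
        if (count == samples.length) {
            samples = Arrays.copyOf(samples, count * 2); // grow the primitive array
        }
        samples[count++] = value;
    }

    /** Computes the tracked quantiles for the current window, then starts a new one. */
    synchronized Map<Double, Long> snapshotAndReset() {
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        count = 0; // reset: the next window starts empty
        Map<Double, Long> result = new LinkedHashMap<>();
        if (sorted.length == 0) {
            return result; // nothing was recorded in this window
        }
        for (double q : quantiles) {
            int rank = (int) Math.ceil(q * sorted.length); // 1-based nearest rank
            result.put(q, sorted[Math.max(rank, 1) - 1]);
        }
        return result;
    }
}
```

A caller would invoke add() on every operation and call snapshotAndReset() from a periodic
task, publishing each returned map as the percentiles for that interval.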

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
