hadoop-common-dev mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-8541) Better high-percentile latency metrics
Date Thu, 28 Jun 2012 23:12:43 GMT
Andrew Wang created HADOOP-8541:
-----------------------------------

             Summary: Better high-percentile latency metrics
                 Key: HADOOP-8541
                 URL: https://issues.apache.org/jira/browse/HADOOP-8541
             Project: Hadoop Common
          Issue Type: Improvement
          Components: metrics
            Reporter: Andrew Wang


Based on discussion in HBASE-6261 and with some HDFS devs, I'd like to make better high-percentile
latency metrics a part of hadoop-common.

I've already got a working implementation of [1], an efficient algorithm for estimating quantiles
on a stream of values. It allows you to specify arbitrary quantiles to track (e.g. 50th, 75th,
90th, 95th, 99th), along with tight error bounds. This estimator can be snapshotted and reset
periodically to get a feel for how these percentiles are changing over time.
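
To make the snapshot-and-reset usage pattern concrete, here is a minimal Java sketch. It is
NOT the algorithm from [1]: it buffers every value in the window and sorts at snapshot time,
so it is exact but uses memory proportional to the window size, which is the cost a streaming
estimator avoids. The class name WindowedQuantiles and its methods are hypothetical and exist
only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical illustration of tracking quantiles over a window that is
 * snapshotted and reset periodically. This is a naive exact version that
 * stores every sample; it is not the streaming estimator from [1].
 */
class WindowedQuantiles {
    private final double[] quantiles;          // e.g. 0.50, 0.75, 0.90, 0.95, 0.99
    private List<Long> window = new ArrayList<>();

    WindowedQuantiles(double... quantiles) {
        this.quantiles = quantiles.clone();
    }

    /** Record one observed value (for example, a latency in microseconds). */
    synchronized void add(long value) {
        window.add(value);
    }

    /** Number of values recorded since the last snapshot. */
    synchronized int count() {
        return window.size();
    }

    /**
     * Return the value at each tracked quantile for the current window,
     * then start a fresh window. Returns 0 for every quantile if the
     * window is empty.
     */
    synchronized long[] snapshotAndReset() {
        List<Long> sorted = window;
        window = new ArrayList<>();            // reset for the next period
        Collections.sort(sorted);

        long[] out = new long[quantiles.length];
        if (sorted.isEmpty()) {
            return out;
        }
        for (int i = 0; i < quantiles.length; i++) {
            // Nearest-rank definition: smallest value with rank >= q * n.
            int rank = (int) Math.ceil(quantiles[i] * sorted.size());
            int idx = Math.min(Math.max(rank - 1, 0), sorted.size() - 1);
            out[i] = sorted.get(idx);
        }
        return out;
    }
}
```

A caller would construct it with the quantiles of interest, call add() on every operation, and
call snapshotAndReset() from a timer to publish one set of percentiles per period. A streaming
estimator would keep the same interface but replace the buffered list with a compact summary.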

I propose creating a new MutableQuantiles class that does this. [1] does carry some overhead
(~1MB of memory for reasonably sized windows), which is why I hesitate to add it to the
existing MutableStat class.

[1] Cormode, Korn, Muthukrishnan, and Srivastava. "Effective Computation of Biased Quantiles
over Data Streams" in ICDE 2005.


        
