hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
Date Wed, 18 Jul 2012 15:37:35 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417178#comment-13417178 ]

stack commented on HBASE-6261:
------------------------------

bq. Stack indicated way back on the mailing list that he was okay waiting for a hadoop-common
version bump, which is kind of a long timescale.

Yeah.  Code copied in tends to never go away (for example, see MurmurHash, which started out
in hbase and has now been in hadoop for a good few years).

bq. If people really urgently want this, we could just copy the code over and then refactor
it away when it's released in hadoop-common.

Sounds like a nice to have.  How much code would you have to copy in?  What would it be? 
Thanks Andrew.


                
> Better approximate high-percentile percentile latency metrics
> -------------------------------------------------------------
>
>                 Key: HBASE-6261
>                 URL: https://issues.apache.org/jira/browse/HBASE-6261
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>              Labels: metrics
>         Attachments: Latencyestimation.pdf
>
>
> The existing reservoir-sampling based latency metrics in HBase are not well-suited for
providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is
a well-studied problem in the literature (see [1] and [2]); the question is determining which
methods best suit our needs and then implementing them.
> Ideally, we should be able to estimate these high percentiles with minimal memory and
CPU usage as well as minimal error (e.g. 1% error on the 90th, or 0.1% on the 99th). It's also
desirable to provide this over different time-based sliding windows, e.g. the last 1 min, 5 mins,
15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency metrics
are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf
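
To illustrate why a reservoir sample is weak at the tail, here is a minimal Python sketch. It is not the HBase metrics code: the function names (reservoir_sample, percentile, tail_sample_count), the reservoir size of 1000, and the exponential latency distribution are all assumptions made for illustration. It uses plain uniform reservoir sampling (Algorithm R) and a nearest-rank percentile. The point is that a uniform sample of k values has only about k * (100 - p) / 100 entries at or above the p-th percentile, so the 99th percentile of a 1000-entry reservoir rests on roughly 10 samples.

```python
import math
import random


def reservoir_sample(stream, k, rng):
    """Algorithm R: keep a uniform random sample of up to k items from a stream."""
    reservoir = []
    for i, value in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(value)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = value
    return reservoir


def percentile(values, p):
    """Nearest-rank percentile of a non-empty sequence, for p in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def tail_sample_count(k, p):
    """Expected number of entries in a size-k uniform sample at or above the p-th percentile."""
    return k * (100 - p) / 100.0


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical latencies in ms, exponentially distributed with mean 10 ms.
    latencies = [rng.expovariate(1 / 10.0) for _ in range(100000)]
    sample = reservoir_sample(latencies, 1000, rng)
    for p in (90, 95, 99):
        print(p, "exact:", percentile(latencies, p),
              "reservoir:", percentile(sample, p),
              "tail samples:", tail_sample_count(len(sample), p))
```

Running the script with different seeds gives a feel for how much the reservoir estimate of the 99th percentile moves around compared with the 90th. The quantile summaries in [1] and [2] instead bound the rank error directly, which is what the issue is asking for.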

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
