hbase-dev mailing list archives

From Andrew Wang <andrew.w...@cloudera.com>
Subject Re: high-percentile latency metrics, aka HBASE-6261
Date Thu, 28 Jun 2012 22:53:13 GMT
I put this on the JIRA too, but the algo I found whittled a stream of
10 million items down to ~19.5k samples. With each sample at ~36B, that's
~685KiB. There's a bit more overhead from using a LinkedList and general
bookkeeping.
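
For reference, the footprint arithmetic can be checked with a short sketch. The sample count and per-sample size are the figures quoted above; the `estimator_footprint_kib` helper and the idea of multiplying by one estimator per time window are illustrative assumptions, not something measured here, and LinkedList and bookkeeping overhead is left out.

```python
# Figures quoted in this message.
SAMPLES_RETAINED = 19_500   # ~19.5k samples kept from a 10 million item stream
BYTES_PER_SAMPLE = 36       # ~36 B per sample

total_bytes = SAMPLES_RETAINED * BYTES_PER_SAMPLE
total_kib = total_bytes / 1024


def estimator_footprint_kib(num_estimators):
    """Hypothetical helper: sample storage if several independent estimators
    each retained the same number of samples (an upper-bound style assumption;
    LinkedList and bookkeeping overhead is not included)."""
    return num_estimators * total_kib


print(f"{total_bytes} bytes = {total_kib:.1f} KiB per estimator")
print(f"4 estimators: {estimator_footprint_kib(4) / 1024:.2f} MiB")
```

Under that assumption, even four estimators (say one each for the 10s, 1m, 5m, and 15m windows) would stay under 3 MiB of sample storage.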

Since the estimator is reset every O(minutes) window, and I doubt very many
metrics see more than 10 million items in O(minutes), it seems lightweight
enough to keep going.
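
A minimal sketch of the reset-per-window idea follows. It is not the algorithm mentioned above: to stay short it keeps every value and sorts on query, where a streaming estimator would keep only a compact sample. The class name `WindowedQuantiles`, the injectable `clock`, and the nearest-rank quantile rule are all assumptions made for illustration.

```python
import math
import time


class WindowedQuantiles:
    """Illustrative only: exact quantiles over a window that resets wholesale.

    Stores every value, so memory grows with the number of inserts. A real
    streaming estimator would retain far fewer samples.
    """

    def __init__(self, window_seconds, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock              # injectable so rollover can be tested
        self._window_start = clock()
        self._values = []

    def _maybe_reset(self):
        # Drop all state once the current window has elapsed.
        now = self._clock()
        if now - self._window_start >= self._window:
            self._values = []
            self._window_start = now

    def insert(self, value):
        self._maybe_reset()
        self._values.append(value)

    def quantile(self, q):
        """Nearest-rank quantile for 0 < q <= 1, or None if the window is empty."""
        self._maybe_reset()
        if not self._values:
            return None
        ordered = sorted(self._values)
        rank = max(1, math.ceil(q * len(ordered)))
        return ordered[rank - 1]
```

Usage would look like `w = WindowedQuantiles(60)`, then `w.insert(latency_ms)` per operation and `w.quantile(0.99)` when the metric is published.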

I'm planning on doing this in hadoop-common's metrics2 since HDFS is also
interested, backporting to 1.x and 2.x. This would thus depend on the
metrics2 conversion (HBASE-4050) going through too.


On Thu, Jun 28, 2012 at 3:31 PM, Stack <stack@duboce.net> wrote:

> On Tue, Jun 26, 2012 at 6:35 PM, Andrew Wang <andrew.wang@cloudera.com>
> wrote:
> > I wanted to ask off JIRA though about what would be useful in practice. I
> > think it'd be nice to see, for example, accurate 90th and 99th percentile
> > latency over recent 10s, 1m, 5m, and 15m time windows. I found some nice
> > algos to do this, I think at the cost of MBs of memory.
> >
> Agree.
> How many MBs?
> > So, is the "full" solution compelling enough to proceed? Anything
> > missing/extraneous?
> >
> What's going on is a critical focus going forward so I'd say 'full'
> unless the cost is obscene.
> St.Ack
