couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Stats Patch API Discussion
Date Tue, 10 Feb 2009 16:17:25 GMT

On 10 Feb 2009, at 16:47, Zachary Zolton wrote:

> Jan,
>
> So you're saying I could run some test, and then hit:
>
> GET /_stats/couchdb/request_time?range=$SOME_MINUTES
>
> And then, make some changes, and run the same test:
>
> GET /_stats/couchdb/request_time?range=$SOME_MINUTES
>
> To detect the difference in performance caused by my changes?!?


Exactly, but where $SOME_MINUTES is hardcoded to 1, 5, or 15
to start with. If you have any reporting and graphing tool connected,
you'd have pretty pictures, too :)
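
A minimal sketch of that before/after comparison, assuming the response shape from the proposal quoted below. The helper names `stats_url` and `mean_delta` and the base URL are hypothetical and not part of the patch:

```python
from urllib.parse import urlencode

# Ranges in minutes; hardcoded to 1, 5, or 15 per the proposal.
ALLOWED_RANGES = (1, 5, 15)


def stats_url(base, module, key, range_minutes=None):
    """Build a /_stats URL; omitting range_minutes means "since startup"."""
    path = f"{base.rstrip('/')}/_stats/{module}/{key}"
    if range_minutes is None:
        return path
    if range_minutes not in ALLOWED_RANGES:
        raise ValueError(f"range must be one of {ALLOWED_RANGES}")
    return path + "?" + urlencode({"range": range_minutes})


def mean_delta(before, after, module="couchdb", key="request_time"):
    """Difference in mean between two parsed stats responses."""
    return after[module][key]["mean"] - before[module][key]["mean"]


# Hypothetical snapshots taken before and after a change.
before = {"couchdb": {"request_time": {"mean": 20, "count": 7}}}
after = {"couchdb": {"request_time": {"mean": 25, "count": 9}}}
url = stats_url("http://localhost:5984", "couchdb", "request_time", 5)
delta = mean_delta(before, after)
```

Fetching `url` once per test run and feeding both parsed bodies to `mean_delta` would give the change in mean request time for the chosen range.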


> That's about the level of performance tuning I'm comfortable with
> doing in PostgreSQL, but it's all over HTTP instead. Nice!

Thanks,
Jan
--


>
>
>
> Cheers,
>
> Zach
>
>
> On Tue, Feb 10, 2009 at 9:19 AM, Jan Lehnardt <jan@apache.org> wrote:
>> Hi,
>>
>> Alex and I are working on our stats package patch and the last
>> bigger issue is the API. It is just exposing a bunch of values by
>> keys, but as usual, the devil is in the details.
>>
>> Let me explain.
>>
>> There are two types of counters. "Hit Counters", that record
>> things like the number of requests. They increase monotonically
>> each time a request hits CouchDB. This is useful for counting
>> stuff. Cool.
>>
>> Then there are "Absolute Value Counters" (for lack of a better
>> term) that collect absolute values like the number of milliseconds
>> a request took to complete. To create a meaningful metric out
>> of this type of counter, we need to create averages. There's little
>> value in recording each individual request for monitoring reports
>> (the access logs can still do that). So we keep some
>> aggregate values (min, max, mean, stddev, count (count being
>> the number of times this counter was called)).
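
A minimal sketch (not taken from the patch) of how such aggregates could be kept without storing every value, using Welford's online algorithm. The class name `Aggregate` is hypothetical, and the choice of population standard deviation is an assumption, since the proposal does not say which variant is used:

```python
import math


class Aggregate:
    """Running min, max, mean, stddev, and count over recorded values."""

    def __init__(self):
        self.count = 0
        self.min = None
        self.max = None
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def record(self, value):
        self.count += 1
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)
        # Welford's update: adjust mean and squared deviations in one pass.
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def stddev(self):
        # Population standard deviation (assumption, see above).
        return math.sqrt(self._m2 / self.count) if self.count else 0.0


agg = Aggregate()
for ms in (10, 20, 30):
    agg.record(ms)
```

Only five numbers are held per counter, so memory use does not grow with the number of requests.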
>>
>> Complexity++
>>
>> Say you have a CouchDB running for a month. You change some
>> things in your app or in CouchDB and you'd like to know how this
>> affected your response time. To effectively see anything you'd have
>> to restart CouchDB (and lose all stats) or wait a month. If you
>> want to see problems coming up in your monitoring, you need
>> finer-grained time ranges to look at.
>>
>> To make this a little more useful Alex and I introduced time ranges.
>> These are an additional set of aggregates that get reset every 1, 5
>> and 15 minutes. This should be familiar to you from server load.
>> You can get the aggregate values for four time ranges:
>>
>> - Between now and the beginning of time (when CouchDB was
>> started).
>> - Between now and 60 seconds ago.
>> - Between now and 300 seconds ago.
>> - Between now and 900 seconds ago.
>>
>> These ranges are hardcoded now, but they can be made configurable
>> at a later time.
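
A rough sketch of the reset behaviour described above, assuming one set of aggregates per range that is cleared once its window has elapsed. The class `RangedCounter` and its lazy reset-on-record strategy are hypothetical; the actual patch may use timers instead:

```python
# Ranges in minutes; 0 stands for "since startup" and is never reset.
RANGES_MINUTES = (0, 1, 5, 15)


class RangedCounter:
    def __init__(self, now):
        self._windows = {
            minutes: {"count": 0, "total": 0.0, "started": now}
            for minutes in RANGES_MINUTES
        }

    def record(self, value, now):
        """Add a value to every range, resetting ranges that have expired."""
        for minutes, window in self._windows.items():
            if minutes and now - window["started"] >= minutes * 60:
                window.update(count=0, total=0.0, started=now)
            window["count"] += 1
            window["total"] += value

    def mean(self, minutes):
        window = self._windows[minutes]
        if window["count"] == 0:
            return None
        return window["total"] / window["count"]


counter = RangedCounter(now=0)
counter.record(10, now=0)
counter.record(30, now=61)  # the 1-minute range has expired and is reset
```

Timestamps are passed in explicitly (in seconds) to keep the sketch deterministic. Because the reset only happens on `record`, a range with no new values would keep reporting stale data; a real implementation would have to handle that case.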
>>
>> The API would look like this:
>>
>> GET /_stats/couchdb/request_time
>>
>> {
>> "couchdb": {
>>  "request_time": {
>>    "description": "Aggregated request time spent in CouchDB since the beginning of time",
>>    "min":20,
>>    "max":20,
>>    "mean":20,
>>    "stddev":20,
>>    "count":7,
>>    "range":0 // 0 means since day zero.
>>  }
>> }
>> }
>>
>> To get the aggregates stats for the last minute:
>>
>> GET /_stats/couchdb/request_time?range=1
>>
>> {
>> "couchdb": {
>>  "request_time": {
>>    "description": "Aggregated request time spent in CouchDB since 1 minute ago",
>>    "min":20,
>>    "max":20,
>>    "mean":20,
>>    "stddev":20,
>>    "count":7,
>>    "range":1 // minute
>>  }
>> }
>> }
>>
>> Or more generic:
>>
>> GET /_stats/couchdb/request_time?range=$range
>>
>> {
>> "couchdb": {
>>  "request_time": {
>>    "description": "Aggregated request time spent in CouchDB since $range minute ago",
>>    "min":20,
>>    "max":20,
>>    "mean":20,
>>    "stddev":20,
>>    "count":7,
>>    "range":$range // minute
>>  }
>> }
>> }
>>
>> This seems reasonable. The actual naming of "range" and other
>> keys can be changed, as well as the description text.
>>
>>
>> Complexity--
>>
>> Remember Hit Counters? Yes, strictly speaking, CouchDB shouldn't
>> want to collect any averages there since our monitoring solution
>> would take care of that. But then, the 4 time-range counters are
>> available and we could just as well populate them, too. Let's
>> say every second:
>>
>> GET /_stats/httpd/requests[?$resolution=[1,5,15]]
>>
>> {
>> "httpd": {
>>  "requests": {
>>    "description": "Number of requests per second in the last $resolution minutes",
>>    "min":20,
>>    "max":20,
>>    "mean":20,
>>    "stddev":20,
>>    "count":7,
>>    "range":$range // minute
>>  }
>> }
>> }
>>
>> "count" would be the raw counter for the stats and the rest
>> meaningful aggregates.
>>
>> "per second" is an arbitrary choice again and can be made
>> configurable, if needed. To know at what frequency stats are
>> collected, there's a new member in the list of aggregates:
>>
>> {
>> "httpd": {
>>  "requests": {
>>    "description": "Number of requests per $frequency seconds in the last $resolution minutes",
>>    "min":20,
>>    "max":20,
>>    "mean":20,
>>    "stddev":20,
>>    "count":7,
>>    "range":$range, // minute
>>    "frequency": 1 // second
>>  }
>> }
>> }
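
One way to read the hit-counter proposal, sketched under assumptions: the monotonic counter is sampled once per `frequency` seconds, and the aggregates are computed over the per-interval increases. The function `rate_aggregates`, the sampling model, and the choice of population standard deviation are all hypothetical:

```python
import statistics


def rate_aggregates(samples, frequency=1):
    """Aggregate per-interval hit rates from cumulative counter samples.

    `samples` holds readings of a monotonically increasing hit counter,
    taken once every `frequency` seconds within one time range.
    """
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    if not deltas:
        return None
    return {
        "min": min(deltas),
        "max": max(deltas),
        "mean": statistics.mean(deltas),
        "stddev": statistics.pstdev(deltas),
        # Assumption: "count" is the number of hits seen within the range.
        "count": samples[-1] - samples[0],
        "frequency": frequency,
    }


# Counter readings at t = 0, 1, 2, 3 seconds.
result = rate_aggregates([0, 2, 5, 9])
```

Here the per-second rates are 2, 3, and 4 requests, so `mean` is the average requests per second while `count` stays the raw number of hits.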
>>
>> Alex and I tried a couple of different approaches to get here:
>> different URLs for the different types of counters and aggregates,
>> adding members in different places, with and without description,
>> and a whole lot more, but we surely haven't seen all permutations.
>>
>> This solution offers a unified URL format and a human readable as
>> well as a computer parseable way to determine what kind of counter
>> you're dealing with.
>>
>> To just get all stats you can do a
>>
>> GET /_stats/
>>
>> and get a huge JSON object back that includes all of the above for
>> all resolutions that are currently collected.
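
A small sketch of how a monitoring script might walk such an object, assuming the two-level module/key nesting shown in the examples above. How multiple resolutions would be laid out in the full response is not specified in the proposal, so this handles a single resolution only; `flatten_stats` is a hypothetical helper:

```python
import json


def flatten_stats(stats):
    """Map "module/key" to its dict of aggregates."""
    flat = {}
    for module, keys in stats.items():
        for key, aggregates in keys.items():
            flat[f"{module}/{key}"] = aggregates
    return flat


# Hypothetical response body, without the // comments used in the examples
# above, since those are not valid JSON.
body = """
{
  "couchdb": {
    "request_time": {"min": 20, "max": 20, "mean": 20, "count": 7, "range": 0}
  },
  "httpd": {
    "requests": {"min": 1, "max": 4, "mean": 2, "count": 7, "range": 0}
  }
}
"""
flat = flatten_stats(json.loads(body))
```

The flattened keys mirror the URL paths, so `flat["couchdb/request_time"]` corresponds to `GET /_stats/couchdb/request_time`.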
>>
>> Is there anything that does not make sense or is too complicated?
>>
>> The goal was to create a simple, minimal API for a minimal set
>> of useful statistics and Alex and I hope to have found this by
>> now. But if you can see how this could be further simplified,
>> let us know :)
>>
>> Alex and I are also open to completely different approaches to get
>> the data out of CouchDB.
>>
>> We're looking for a few things in this thread:
>>
>> - A sanity check to know we're not completely off.
>> - A summary for you of our way of getting to the current proposal.
>> - A consensus of dev@-readers for the final API we'd like to
>> implement.
>>
>> Note that a few of these things are already implemented and
>> others need to be adjusted depending on feedback here.
>>
>> Please, feed back,
>>
>> Cheers
>> Alex & Jan
>> --
>>
>>
>

