Date: Tue, 10 Feb 2009 09:47:13 -0600
Subject: Re: Stats Patch API Discussion
From: Zachary Zolton
To: dev@couchdb.apache.org

Jan,

So you're saying I could run some test, and then hit:

GET /_stats/couchdb/request_time?range=$SOME_MINUTES

And then, make some changes, and run the same test:

GET /_stats/couchdb/request_time?range=$SOME_MINUTES

To detect the difference in performance caused by my changes?!?

That's about the level of performance tuning I'm comfortable with
doing in PostgreSQL, but it's all over HTTP instead. Nice!

Cheers,
Zach

On Tue, Feb 10, 2009 at 9:19 AM, Jan Lehnardt wrote:
> Hi,
>
> Alex and I are working on our stats package patch and the last
> bigger issue is the API. It is just exposing a bunch of values by
> keys, but as usual, the devil is in the details.
>
> Let me explain.
>
> There are two types of counters. "Hit Counters" record
> things like the number of requests. They increase monotonically
> each time a request hits CouchDB. This is useful for counting
> stuff. Cool.
>
> Then there are "Absolute Value Counters" (for lack of a better
> term) that collect absolute values like the number of milliseconds
> a request took to complete. To create a meaningful metric out
> of this type of counter, we need to create averages. There's little
> value in recording each individual request for monitoring reports
> (that can still be done in the access logs). So we keep some
> aggregate values (min, max, mean, stddev, count (count being
> the number of times this counter was called)).
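A minimal sketch of how the aggregates listed above (min, max, mean,
stddev, count) could be maintained without storing each individual
value. It assumes Welford's online algorithm and a population standard
deviation; the thread does not say how the patch computes these, and
the class and method names are hypothetical.

```python
import math


class RunningAggregate:
    """Hypothetical holder for min, max, mean, stddev and count."""

    def __init__(self):
        self.count = 0
        self.min = None
        self.max = None
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def record(self, value):
        """Fold one observed value (e.g. a request time in ms) into the aggregates."""
        self.count += 1
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)
        # Welford's update: adjust the mean first, then the squared deviations.
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def stddev(self):
        # Population standard deviation (an assumption); 0.0 with no samples.
        return math.sqrt(self._m2 / self.count) if self.count else 0.0

    def as_dict(self):
        return {
            "min": self.min,
            "max": self.max,
            "mean": self.mean,
            "stddev": self.stddev,
            "count": self.count,
        }


agg = RunningAggregate()
for request_ms in (10, 20, 30):
    agg.record(request_ms)
```

Each call to record() costs constant time and memory, which is why
only the five aggregate values need to be kept per counter.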
>
> Complexity++
>
> Say you have a CouchDB running for a month. You change some
> things in your app or in CouchDB and you'd like to know how this
> affected your response time. To effectively see anything you'd have
> to restart CouchDB (and lose all stats) or wait a month. If you
> want to see problems coming up in your monitoring, you need
> finer-grained time ranges to look at.
>
> To make this a little more useful, Alex and I introduced time ranges.
> These are an additional set of aggregates that get reset every 1, 5
> and 15 minutes. This should be familiar to you from server load.
> You can get the aggregate values for four time ranges:
>
>  - Between now and the beginning of time (when CouchDB was
>    started).
>  - Between now and 60 seconds ago.
>  - Between now and 300 seconds ago.
>  - Between now and 900 seconds ago.
>
> These ranges are hardcoded now, but they can be made configurable
> at a later time.
>
> The API would look like this:
>
> GET /_stats/couchdb/request_time
>
> {
>   "couchdb": {
>     "request_time": {
>       "description": "Aggregated request time spent in CouchDB since the beginning of time",
>       "min":20,
>       "max":20,
>       "mean":20,
>       "stddev":20,
>       "count":7,
>       "range":0 // 0 means since day zero.
>     }
>   }
> }
>
> To get the aggregate stats for the last minute:
>
> GET /_stats/couchdb/request_time?range=1
>
> {
>   "couchdb": {
>     "request_time": {
>       "description": "Aggregated request time spent in CouchDB since 1 minute ago",
>       "min":20,
>       "max":20,
>       "mean":20,
>       "stddev":20,
>       "count":7,
>       "range":1 // minute
>     }
>   }
> }
>
> Or, more generically:
>
> GET /_stats/couchdb/request_time?range=$range
>
> {
>   "couchdb": {
>     "request_time": {
>       "description": "Aggregated request time spent in CouchDB since $range minutes ago",
>       "min":20,
>       "max":20,
>       "mean":20,
>       "stddev":20,
>       "count":7,
>       "range":$range // minute
>     }
>   }
> }
>
> This seems reasonable. The actual naming of "range" and other
> keys can be changed, as well as the description text.
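A small client-side sketch of the proposed URL scheme and response
shape, for illustration only. The helper names stats_url and
parse_stats are hypothetical, the base URL is just an example, and the
sample body omits the // comments from the proposal because they are
not valid JSON. Nothing here talks to a real server.

```python
import json
from urllib.parse import urlencode


def stats_url(base, module, key, range_minutes=None):
    """Build /_stats/<module>/<key>, optionally with ?range=<minutes>."""
    path = f"{base.rstrip('/')}/_stats/{module}/{key}"
    if range_minutes is None:
        return path  # no range: aggregates since CouchDB was started
    return f"{path}?{urlencode({'range': range_minutes})}"


def parse_stats(body, module, key):
    """Pull one counter's aggregates out of a response shaped as proposed."""
    return json.loads(body)[module][key]


# Example response body modelled on the proposal (values are placeholders).
sample_body = json.dumps({
    "couchdb": {
        "request_time": {
            "description": "Aggregated request time spent in CouchDB since 1 minute ago",
            "min": 20, "max": 20, "mean": 20, "stddev": 20,
            "count": 7, "range": 1,
        }
    }
})

url = stats_url("http://localhost:5984", "couchdb", "request_time", 1)
stats = parse_stats(sample_body, "couchdb", "request_time")
```

Because the module and key appear both in the path and as nested keys
in the response, the same two arguments locate a counter on both sides.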
>
>
> Complexity--
>
> Remember Hit Counters? Yes, strictly speaking, CouchDB shouldn't
> want to collect any averages there since our monitoring solution
> would take care of that. But then, there are the 4 time-range counters
> available and we could just as well populate them too. Let's
> say every second:
>
> GET /_stats/httpd/requests[?$resolution=[1,5,15]]
>
> {
>   "httpd": {
>     "requests": {
>       "description": "Number of requests per second in the last $resolution minutes",
>       "min":20,
>       "max":20,
>       "mean":20,
>       "stddev":20,
>       "count":7,
>       "range":$range // minute
>     }
>   }
> }
>
> "count" would be the raw counter for the stats and the rest meaningful
> aggregates.
>
> "per second" is an arbitrary choice again and can be made configurable,
> if needed. To know at what frequency stats are collected, there's a new
> member in the list of aggregates:
>
> {
>   "httpd": {
>     "requests": {
>       "description": "Number of requests per $frequency seconds in the last $resolution minutes",
>       "min":20,
>       "max":20,
>       "mean":20,
>       "stddev":20,
>       "count":7,
>       "range":$range, // minute
>       "frequency": 1 // second
>     }
>   }
> }
>
> Alex and I tried to find a couple of different approaches to get here.
> Different URLs for the different types of counters and aggregates,
> adding members in different places, with and without description and
> a whole lot more, but we sure haven't seen all permutations.
>
> This solution offers a unified URL format and a human-readable as
> well as a computer-parseable way to determine what kind of counter
> you're dealing with.
>
> To just get all stats you can do a
>
> GET /_stats/
>
> and get a huge JSON object back that includes all of the above for all
> resolutions that are currently collected.
>
> Is there anything that does not make sense or is too complicated?
>
> The goal was to create a simple, minimal API for a minimal set
> of useful statistics and Alex and I hope to have found this by
> now. But if you can see how this could be further simplified,
> let us know :)
>
> Alex and I are also open to completely different approaches to get
> the data out of CouchDB.
>
> We're looking for a few things in this thread:
>
>  - A sanity check to know we're not completely off.
>  - A summary for you of our way of getting to the current proposal.
>  - A consensus of dev@-readers for the final API we'd like to implement.
>
> Note that a few of these things are already implemented and
> others need to be adjusted depending on feedback here.
>
> Please give us feedback,
>
> Cheers
> Alex & Jan
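For the per-second hit-counter aggregates proposed under
"Complexity--", a minimal sketch of how sampled counter readings could
be turned into the min/max/mean/stddev/count members. It assumes the
counter is read every `frequency` seconds and that "count" means the
hits seen within the window; both are one reading of the proposal, not
something the thread confirms. The function name is hypothetical.

```python
from statistics import mean, pstdev


def hit_rate_aggregates(samples, frequency=1, range_minutes=1):
    """Aggregate a hit counter sampled at a fixed interval.

    samples: cumulative counter readings, one every `frequency` seconds,
             oldest first. The counter only ever increases.
    """
    # Hits per sampling interval are the differences between readings.
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    if not deltas:
        raise ValueError("need at least two counter readings")
    return {
        "min": min(deltas),
        "max": max(deltas),
        "mean": mean(deltas),
        "stddev": pstdev(deltas),  # population stddev (an assumption)
        "count": samples[-1] - samples[0],  # hits within the window
        "range": range_minutes,
        "frequency": frequency,
    }


# Four readings taken one second apart: 2, 3 and 4 hits per second.
result = hit_rate_aggregates([0, 2, 5, 9])
```

Working from differences between readings keeps the hit counter
itself a plain monotonic integer, as described in the original mail.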