couchdb-erlang mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: starting on metrics
Date Thu, 15 Nov 2012 22:45:35 GMT
Definitely good to make something work to play with. On a related note I
think we need to seriously reevaluate some of the ways we use the config
for these bits (granted, that's a future only tangentially related thing).

As to your list of metrics, I think it depends on what you mean. The
general types of stats that I'm aware of usually fit into a small number of
categories:

counters - generally speaking an atomically incrementing value (e.g., open
couchjs processes)
gauges - record an absolute value (e.g., CPU temperature)
meters - record a rate of events (e.g., HTTP requests)
statsystuff - Slightly more complicated bits for recording stats on
recorded values (e.g., request latency with avg/stddev/min/max/percentiles)
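
To make the categories concrete, here is a minimal sketch in Python (chosen only for brevity; CouchDB itself is Erlang). Every class and method name below is hypothetical and not taken from CouchDB, folsom, or any other library. The "statsy" type keeps running avg/stddev/min/max using Welford's online algorithm and leaves percentiles out, since those need stored samples or a sketch structure. Meters are omitted because they can be derived from a counter.

```python
import math
import threading


class Counter:
    """Atomically adjusted integer, e.g. number of open couchjs processes."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def inc(self, n=1):
        # The lock makes the read-modify-write safe across threads.
        with self._lock:
            self._value += n

    def value(self):
        with self._lock:
            return self._value


class Gauge:
    """Records the most recent absolute value, e.g. CPU temperature."""

    def __init__(self):
        self._value = None

    def set(self, value):
        self._value = value

    def value(self):
        return self._value


class Summary:
    """Running count/min/max/mean/stddev over recorded values (no percentiles)."""

    def __init__(self):
        self.count = 0
        self.min = math.inf
        self.max = -math.inf
        self._mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def record(self, x):
        # Welford's online update: no need to keep the samples around.
        self.count += 1
        self.min = min(self.min, x)
        self.max = max(self.max, x)
        delta = x - self._mean
        self._mean += delta / self.count
        self._m2 += delta * (x - self._mean)

    def mean(self):
        return self._mean

    def stddev(self):
        # Population standard deviation; 0.0 until there are two samples.
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / self.count)
```

Usage would look like `latency = Summary(); latency.record(12.5)` at the end of each request, with a reporter reading `latency.mean()` and friends on whatever schedule it likes.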

And I'd note that you can get away without some of these. Meters can be
implemented with a counter and then using a derivative when graphing
(Graphite does this with the nonNegativeDerivative function).
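
A sketch of the counter-to-rate idea, assuming the counter is sampled periodically as (timestamp, value) pairs. The function name and the choice to divide by elapsed seconds are assumptions made for this example; Graphite's own function differs in its details. Negative deltas (for instance after a counter reset on restart) are reported as missing rather than as a negative rate.

```python
def non_negative_derivative(samples):
    """Turn (timestamp_seconds, counter_value) samples into per-second rates.

    `samples` must be sorted by timestamp. The result has one entry per
    consecutive pair: (timestamp_of_later_sample, rate_or_None).
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        delta = v1 - v0
        elapsed = t1 - t0
        if delta < 0 or elapsed <= 0:
            # Counter went backwards (reset) or timestamps are unusable.
            rates.append((t1, None))
        else:
            rates.append((t1, delta / elapsed))
    return rates
```

With this in the graphing layer, the server only ever has to bump an integer; the rate is computed by whoever reads the data.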

(Didn't know where to put this, but the middle seems good) Also one thing
we should look into is removing the time series based stats. I.e., the "stats
over the last 1, 60, 300 seconds" stuff as it makes things quite difficult and
AFAIK isn't really useful (especially if you forward to a metrics analysis
system). This would save us significantly in CPU and complexity.

If I were going to write this code I would start by taking a look at a few
other libraries and then figuring out what we might need as an API within
the code base. Right now I could see us getting away with just counters,
gauges, and maybe a basic statsy kind.

Once you have the API then it's just a matter of figuring out how to specify
an implementation. I'm not sure what you mean by a custom behavior in this
particular instance. We could write a behavior for a stats processor that
implements the metric types we decide on I guess. It's really not super
duper important other than it provides some compile time checks (but it
also requires figuring out code paths when you compile the module that
implements the behavior (and given that this thing would see high traffic I
would go without cause you'll see if you forgot to implement a function
quite quickly)). The newer couch_index code does stuff kinda like this.
Though it's a lot more involved than you'd want it to be. Also, more wild ideas
in response to your efficiency questions.

So I can actually think of a couple ways to do this efficiently that will
limit the overhead for implementation. They're a bit complex in terms of the
hack, but would be relatively constrained in where the complexity lives.
For the time being I would start with something like mochiglobal to
efficiently decide if you need to record a metric. Although that's a bit
restrictive in that it requires atoms as key names. I have a similar module
I can open source that allows arbitrary keys at the expense of adding a
function clause pattern match. Although if you want to get *really*
awesomely crazy, a fun way to try doing this particular "implementation
swap" would be to dynamically replace the implementation module at runtime
(not as crazy as it sounds, but still slightly crazy). CouchDB could
ship with two versions of this module. One would be the current "expose
values over HTTP" method and one could be a "no-op" that people who just
wanted performance could use (nfc what the performance penalties are of the
current style, though it has tipped nodes over before).
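
The "two versions of the module" idea can be sketched as follows. This is a Python analogy only: in Erlang the swap would more likely be done by loading a different compiled module under the same name, whereas here a module-level reference is simply rebound. All names (`NoopStats`, `InMemoryStats`, `set_impl`, `incr`, `gauge`) are made up for the example.

```python
class NoopStats:
    """Drop-in implementation that records nothing, for minimal overhead."""

    def incr(self, name, n=1):
        pass

    def gauge(self, name, value):
        pass


class InMemoryStats:
    """Keeps values in dicts so a reporter (e.g. an HTTP handler) can read them."""

    def __init__(self):
        self.counters = {}
        self.gauges = {}

    def incr(self, name, n=1):
        self.counters[name] = self.counters.get(name, 0) + n

    def gauge(self, name, value):
        self.gauges[name] = value


# Metrics are off by default; callers never check a flag themselves.
_impl = NoopStats()


def set_impl(impl):
    """Swap the active implementation at runtime, no restart or recompile."""
    global _impl
    _impl = impl


def incr(name, n=1):
    _impl.incr(name, n)


def gauge(name, value):
    _impl.gauge(name, value)
```

Call sites just use `incr("httpd.requests")`; whether that costs a dict update or an empty method call is decided by whichever implementation was last passed to `set_impl`.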

Things to look at for thoughts:

http://metrics.codahale.com/
https://github.com/basho/folsom
https://collectd.org/wiki/index.php/Data_source



On Thu, Nov 15, 2012 at 4:35 PM, Dave Cottlehuber <dch@jsonified.com> wrote:

> On 15 November 2012 14:13, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> > The idea here is good but I'm not at all a fan of the implementation.
> First
> > off, no way should we be choosing a specific stats collection protocol.
> > They're just too specific to a particular operations/infra configuration
> > that anything we pick is going to be inadequate for a non trivial number
> of
> > users.
>
> Absolutely - but as a first go I am learning a lot :-)). First make it
> work, then make it pretty?
>
> Yesterday I hacked in starting up estatsd and enabling/disabling this
> via config file:
>
>
> https://github.com/dch/couchdb/commit/e885e55ee91b77be41363c0fd76414036650dcaa
>
> It's hacky but it works, I think.
>
> > OTOH, I think it would be a very good idea to sit down and design the
> stats
> > API to be pluggable. We already have two rough sides to the API
> (collection
> > vs reporting). If we sat down and designed a collection API that would
> then
> > talk to a configurable reporting API it'd allow for users to do a number
> of
> > cool things with stats.
>
> Nice split.
>
> Re measuring "properly" we could get by with 3 "things":
>
> - counters (http reqs, # of active couchjs procs maybe)
> - duration
> - events (replication started, etc)
>
> And then plug into graphite, riemann, whatever takes your fancy. Would
> the best way to provide that API interface for these counters be to write
> a custom behaviour? Any existing code you can point to that does this
> sort of thing?
>
> Last question, any tip on how to implement this in a way that you can
> turn off metrics and avoid the performance hit completely, without
> needing a recompile (e.g. to remove macros)?
>
> A+
> Dave
>
