couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Statistics Module
Date Sat, 31 Jan 2009 01:12:16 GMT
On Fri, Jan 30, 2009 at 2:42 PM, Jan Lehnardt <jan@apache.org> wrote:
> Hi Paul,
>
> thanks for your feedback.
>
> On 30 Jan 2009, at 00:26, Paul Davis wrote:
>>
>> Two concerns:
>>
>> The way that stats are calculated currently, with time as the
>> dependent variable, could cause some issues in implementing more
>> statistics. With my extremely limited knowledge of stats, I think
>> making them dependent on the number of requests might be better.
>> This is something that hopefully someone out there knows more about.
>> (This is in terms of "avg for last 5 minutes" vs. "avg for last 100
>> requests", the latter making stddev-type stats calculable on the
>> fly in constant memory.)
>
> I think you are going to like the new aggregator :)
>

Code looks awesome. For reference, they're storing the last N samples
of each counter over a given time period, i.e. they store the current
counter values once a second.
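To make the "constant memory" point above concrete: a running mean and standard deviation over a stream of samples can be maintained with Welford's online algorithm. A purely illustrative JavaScript sketch (invented names, not code from the patch):

```javascript
// Welford's online algorithm: running mean and (sample) variance
// in O(1) memory, updated once per incoming sample.
// Illustrative sketch only -- not the actual aggregator code.
function makeRunningStats() {
  var n = 0, mean = 0, m2 = 0; // m2 = sum of squared deviations from mean
  return {
    push: function(x) {
      n += 1;
      var delta = x - mean;
      mean += delta / n;
      m2 += delta * (x - mean);
    },
    mean: function() { return mean; },
    stddev: function() { return n > 1 ? Math.sqrt(m2 / (n - 1)) : 0; }
  };
}
```

This is what makes "stats over the last 100 requests" cheap: each request updates three numbers, and no sample history needs to be kept for mean and stddev.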

>
>> I'm sorry, but I gotta say: 'should keep the same number of open
>> databases when reaching the max_dbs_open limit' is a really fucking
>> verbose test name. I'm all for making our tests more granular, etc.,
>> but those names are just incomprehensible. Perhaps we can abandon the
>> novellas with some better categorization semantics?
>
> The idea here is a natural-language product description that makes
> clear what is happening without you having to look at the test code,
> which can be non-obvious. If a test fails, you get a neat error
> message and you are ready to go.
>
> Granted, the two non-native speakers at work here might not have
> picked the most concise wording, but that can be improved without
> changing the general style. http://rspec.info/ has more information
> about this style. I've seen Alex have great success with this style
> of BDD/TDD development and I am happy he pushed me into adopting it
> for this.
>
> One thing to keep in mind is that behaviour specification is not unit
> testing, but that's not what the JS suite has been doing anyway.
> Maybe we can split things up.
>

I find fault with the logic that long names in any way lead to
success. The two things this style does offer are a focus on small,
independent tests and a hierarchical categorization of tests.

Just pulling out the first section:

  stats: function(debug) {
    var open_databases_tests = {
      'should increment the number of open databases when creating a db': function...
      'should increment the number of open databases when opening a db': function...
      'should decrement the number of open databases when deleting': function...
      'should keep the same number of open databases when reaching the max_dbs_open limit': function...
      'should return 0 for number of open databases after call to restartServer()': function...

Contrast that with:

stats: {
    database: {
        create_inc: function() .....
        open_inc: function() ...
        delete_dec: function() ...
        max_open: function() ...
        reset: function() ....
    }
}

Everything in your source code should be trying to convey meaning as
quickly and efficiently as possible. All these names do is add
redundant information and make the tests a pain to read.

I definitely agree that we could spend some time organizing our test
suite into smaller files. Perhaps one file per current test, and then
split up some of the bigger tests into smaller bits. eunit tests for
the internals are also made of win. I am in no way against stealing
good ideas, but long names just aren't one of them.

HTH,
Paul Davis

> Cheers
> Jan
> --
> Also: Fucking.
>
>> On Thu, Jan 29, 2009 at 5:42 PM, Jan Lehnardt <jan@apache.org> wrote:
>>>
>>> Hi list,
>>>
>>> Alex Lang and I have been working on a statistics module this week.
>>> We'd like to share our intermediate results and discuss further steps
>>> and acceptance of the module into the CouchDB codebase. The code
>>> lives on GitHub:
>>>
>>>  http://github.com/janl/couchdb/tree/stats
>>>
>>> A full diff against trunk from a few hours ago lives on Friendpaste:
>>>
>>>  http://friendpaste.com/3IfIyRv5EzXEnPyk7S9xqe
>>>
>>> Be gentle :)
>>>
>>> The driving idea was to make it easy for 3rd party monitoring solutions
>>> to pick up runtime statistics of a single CouchDB node. The stats module
>>> does not collect long-term stats persistently. It does not draw
>>> pretty graphs and it does not come with a pony.
>>>
>>> The stats module is rather simple: it consists of two Erlang modules,
>>> a collector and an aggregator. The collector is a `gen_server` holding
>>> an `ets` table that stores integer values associated with keys. For now
>>> these are simple counters. All CouchDB modules can send a message
>>> to the collector with any key they want to get a metric counted, e.g.
>>> `couch_httpd`:
>>>
>>> couch_stats_collector:increment({httpd, requests}).
>>>
>>> to count the total number of requests made.
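For illustration, the collector's counter semantics boil down to a key-to-integer map where missing keys start at zero. A hypothetical JavaScript sketch, not the actual `gen_server`/`ets` implementation:

```javascript
// Hypothetical sketch of the collector's counter semantics:
// increment() bumps a per-key integer, creating missing keys at zero.
// Not the actual Erlang gen_server/ets code from the patch.
function makeCollector() {
  var counters = {};
  return {
    increment: function(key) {
      counters[key] = (counters[key] || 0) + 1;
    },
    get: function(key) {
      return counters[key] || 0;
    }
  };
}
```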
>>>
>>> We plan to add support for minimum and maximum values of
>>> different metrics soon.
>>>
>>> The aggregator has two jobs: it exposes the raw counter values from
>>> the collector to the HTTP plugin (`couch_httpd_stats_handlers.erl`)
>>> and it aggregates counter values into more meaningful numbers like
>>> average requests per second over 60 seconds, 300 seconds, and 900
>>> seconds (cron-style).
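For illustration, a windowed average can be derived from periodic counter snapshots: keep one snapshot per second and divide the counter delta over the window by the window length. A hypothetical JavaScript sketch with invented names, not the aggregator's actual code:

```javascript
// Hypothetical sketch of deriving windowed rates from counter snapshots.
// Assumes a monotonically increasing counter sampled once per second;
// avgPerSecond(w) divides the counter delta over the last w samples by
// the number of seconds actually covered.
function makeWindowedRate(maxSeconds) {
  var snapshots = []; // one counter snapshot per second, newest last
  return {
    sample: function(counterValue) {
      snapshots.push(counterValue);
      if (snapshots.length > maxSeconds + 1) snapshots.shift();
    },
    avgPerSecond: function(windowSeconds) {
      var newest = snapshots.length - 1;
      var oldest = Math.max(0, newest - windowSeconds);
      var span = newest - oldest;
      return span > 0 ? (snapshots[newest] - snapshots[oldest]) / span : 0;
    }
  };
}
```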
>>>
>>> Reading values is done through a single API endpoint, `/_stats`. The
>>> canonical format is `/_stats/modulename/statkey`, e.g.:
>>>
>>> /_stats/httpd/requests
>>>
>>> will return a JSON string:
>>>
>>> {"httpd": {"requests":3241}}
>>>
>>> We'll have a combined view for all stats under `GET /_stats`.
>>>
>>>
>>> A few notes:
>>>
>>> - Stats URLs are read-only.
>>>
>>> - This is a first iteration; we can add as many fancy stats and
>>>  metrics as we like. We went with the ones that felt most useful to us.
>>>
>>> - The `{module, key}` identifiers for stats are not yet finalized. We
>>>  realized early on that we'd like to have a fair number of stats before
>>>  looking into a good naming scheme. We don't have one yet;
>>>  recommendations are very welcome.
>>>
>>> - We'll add more inline-comments to key parts of the code.
>>>
>>> - We're not yet fully integrated into the build system (see below). eunit
>>>  offers a variety of options to conditionally compile code for testing
>>>  and to move test code to separate files, if that is preferred.
>>>
>>> - The way we calculate averages works, but is a bit clunky and can be
>>>  improved. Also, different kinds of averages can be added.
>>>
>>> - Stats are lost on server shutdown or when restartServer() is called,
>>>  which the JavaScript test suite does.
>>>
>>> - Some lines exceed the soft limit of 79 chars. We'll address that in the
>>>  future, if required.
>>>
>>> - Basic testing shows no significant slowdowns so far. Proper load
>>>  testing has not been done yet.
>>>
>>> Test Driven
>>>
>>> - We took a test driven approach developing this. Alex is an expert on
>>>  TDD (with high success rates, but that's another story). We added
>>>  an extensive number of tests to `couch_tests.js` that differ in style
>>>  from the existing tests and are more oriented towards proven practices.
>>>  They are also more verbose and try to test only one thing at a time.
>>>  We hope you like the new style and we'd like to nudge the devs into
>>>  adopting it. The infrastructure for the new-style tests is not
>>>  final and we can still change things around, if needed.
>>>
>>> - For things that were hard or impossible to test through the REST
>>>  API we added eunit-style tests to our core modules that ensure their
>>>  functionality. We added a temporary make target `make t` to launch
>>>  our tests. You need Erlang/OTP R12B-5 for that or an existing eunit
>>>  installation. We'd like to work with Noah to integrate that into the
>>>  existing `make test` target (and integrate the existing bits).
>>>
>>>
>>> Review
>>>
>>> We're looking for validation of our architectural approach, as well as
>>> for integration of our `increment` calls into the existing CouchDB codebase.
>>>
>>>
>>> Legal
>>>
>>> We're releasing the source under the Apache 2.0 license. I (Jan) have
>>> a CLA on file, Alex does not. If I remember correctly, my CLA is enough
>>> to get the code into the hands of the ASF. Alex will receive honourable
>>> mentions in our THANKS file. :)
>>>
>>> The stats module has been developed under a contract with the BBC.
>>> They need to be able to integrate CouchDB with their existing monitoring
>>> solution and we hope this will get them there. The BBC has no interest
>>> in maintaining the code as a patch and has subsequently allowed us
>>> to release the code as open source under the Apache 2.0 license and
>>> asked if we could offer it to the CouchDB community for potential
>>> inclusion.
>>> By ASF rules, a proper software grant by the BBC might still have to
>>> occur, but take that as a given.
>>>
>>>
>>> Next Steps
>>>
>>> - We'd like to hammer out any architectural issues you might find.
>>>
>>> - We'll work on a decent naming scheme for stats.
>>>
>>> - We'll clean up code, add tests, and comments.
>>>
>>> - We'll add more stats and the missing bits we mentioned earlier.
>>>
>>> - We might end up adding SNMP support.
>>>
>>> - Once in some shape (and with all legal issues cleared), we'd like
>>> to move the code to an ASF subversion branch and finish integration
>>> into trunk from there. Hopefully before 0.9.
>>>
>>>
>>> Thanks for your time,
>>>
>>> Cheers
>>> Alex & Jan
>>> --
>>> PS from my PMC-hat: I think the stats module is a valuable addition
>>> to CouchDB and I welcome any effort to integrate it.
>>>
>>>
>>
>
>
