couchdb-user mailing list archives

From Kinley Dorji <kinl...@gmail.com>
Subject Re: filtering on timestamp + aggregation on another field
Date Tue, 15 Mar 2011 04:29:42 GMT
Aroj,

I think you will always have choices on how to implement it, with the
final decision resting on CouchDB efficiencies (choice of what should
be keyed and what should be included in values, as Nils has noted) and
what your reporting needs are. Here is another option:

At the map level: emit(doc.timestamp, {country: doc.country, city: doc.city,
clinic: doc.clinic, beds: doc.beds})

At the reduce level, write your js code to suit your aggregation requirements.

At the view query level, in addition to the startkey and endkey for the
timestamp, add the parameter group_level=1, 2, 3, etc.

If you select group_level 1 your aggregation would be for country, 2
for city and 3 at the clinic level.

This option gives you choices at query time, but whether it is
suitable for you again depends on the specifics of your requirements.
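
To make the grouping step concrete, here is a minimal sketch that can be run locally with Node. It rests on assumptions that are mine and not settled in this thread: group_level groups on the leading elements of an array key, so the sketch puts [country, city, clinic] in the key (unlike the emit above), it leaves the timestamp range out entirely, and the sample documents and the queryView helper are hypothetical stand-ins for a real view query.

```javascript
// Hypothetical sample documents; field names follow the emit above.
const docs = [
  { timestamp: "2011-01-01", country: "India", city: "Pune", clinic: "A", beds: 4 },
  { timestamp: "2011-01-02", country: "India", city: "Pune", clinic: "B", beds: 6 },
  { timestamp: "2011-01-02", country: "India", city: "Mumbai", clinic: "C", beds: 5 },
];

// Map function in CouchDB style. In a design document emit is a global;
// here it is passed in so the sketch runs outside CouchDB.
// Assumption: the location fields form an array key so group_level can use them.
function map(doc, emit) {
  emit([doc.country, doc.city, doc.clinic], doc.beds);
}

// Reduce function: total number of beds. Summing is also valid on rereduce.
function reduce(keys, values, rereduce) {
  return values.reduce((total, v) => total + v, 0);
}

// Local stand-in for a grouped view query: truncate every key to its first
// groupLevel elements and reduce the values that share a truncated key.
// Unlike CouchDB, this helper does not sort the rows by key.
function queryView(allDocs, groupLevel) {
  const rows = [];
  allDocs.forEach((doc) => map(doc, (key, value) => rows.push({ key, value })));
  const groups = new Map();
  for (const row of rows) {
    const groupKey = JSON.stringify(row.key.slice(0, groupLevel));
    if (!groups.has(groupKey)) groups.set(groupKey, []);
    groups.get(groupKey).push(row.value);
  }
  return [...groups].map(([k, values]) => ({
    key: JSON.parse(k),
    value: reduce(null, values, false),
  }));
}

console.log(queryView(docs, 1)); // one row per country
console.log(queryView(docs, 2)); // one row per country and city
```

With these three documents, level 1 gives a single row for India with 15 beds, level 2 gives one row each for Pune (10) and Mumbai (5), and level 3 gives one row per clinic.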

Similarly, you are presently representing the timestamp as a date. If you
were to make the timestamp a compound key, i.e. [year, month, day,
hour], you could use the same view to query by year and month (which
probably will not be useful from the viewpoint of bed availability)
and also at the hour level, which might be useful (assuming updates on
bed availability are dynamic/real time).
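
A rough sketch of such a compound key, runnable locally with Node. The helper name timestampKey is hypothetical, and the sketch assumes that doc.timestamp is an ISO 8601 string in UTC and that the JavaScript engine can parse that format; adjust the parsing if your timestamps are stored differently.

```javascript
// Split a timestamp into [year, month, day, hour] for use as a compound key.
// Assumption: the timestamp is an ISO 8601 UTC string such as
// "2011-03-15T04:29:42Z".
function timestampKey(timestamp) {
  const d = new Date(timestamp);
  return [
    d.getUTCFullYear(),
    d.getUTCMonth() + 1, // getUTCMonth() is zero-based
    d.getUTCDate(),
    d.getUTCHours(),
  ];
}

// Map function in CouchDB style. In a design document emit is a global;
// here it is passed in so the sketch runs outside CouchDB.
function map(doc, emit) {
  emit(timestampKey(doc.timestamp), doc.beds);
}

console.log(timestampKey("2011-03-15T04:29:42Z")); // [ 2011, 3, 15, 4 ]
```

With a key shaped like this, group_level=1 would group by year, 2 by month, 3 by day and 4 by hour, while startkey and endkey can still bound the time range.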

My two pennies.

On Tue, Mar 15, 2011 at 12:28 AM, Nils Breunese <N.Breunese@vpro.nl> wrote:
> This looks fine to me. To keep the index storage to a minimum I wouldn't
> store the doc as the value in the view, but only the absolute minimum you
> need. Hint: the value can even be null (which doesn't take a lot of space
> to store!) and you can use ?include_docs=true to retrieve the documents
> with the view query. We use this for almost all of our views, so storage
> is mostly going to storing the documents themselves and views add little
> overhead.
>
> Nils.
> ________________________________________
> From: Aroj George [arojis@gmail.com]
> Sent: Monday 14 March 2011 19:13
> To: user@couchdb.apache.org; Kinley Dorji
> Subject: Re: filtering on timestamp + aggregation on another field
>
> Thanks for the below.
>
> Another option we came up with is as below,
>
> map:
> for each level in the location hierarchy:
>     emit([level,timestamp],doc)
>
> which will produce something like the below,
>
> *for given documents:*
> { timestamp : 01/01/2011, location : [India, Maharashtra,Pune] , other_attrs
> }
> { timestamp : 01/02/2011, location : [India, Maharashtra,Mumbai] ,
> other_attrs }
>
> *Map output:*
> 1. [India,01/01/2011], doc
> 2. [Maharashtra,01/01/2011], doc
> 3. [Pune,01/01/2011], doc
> 4. [India,01/02/2011], doc
> 5. [Maharashtra,01/02/2011], doc
> 6. [Mumbai,01/02/2011], doc
>
> Now we can have a query like,
> startkey=[India,01/01/2011] & endkey=[India,01/03/2011] & group_level=1
>
> which should give me the documents grouped on India but filtered on
> timestamp.
>
> The question is, is this a good solution? One concern being the number of
> records in the view now is number_of_levels * num_documents,
> i.e. in this case 2 documents * 3 levels = 6 records in the view.
>
> Will couch performance suffer with this approach?
>
> Rgds,
> Aroj
