couchdb-user mailing list archives

From Brian Candler <B.Cand...@pobox.com>
Subject Re: counting tags within a date range
Date Mon, 01 Mar 2010 16:43:09 GMT
On Mon, Mar 01, 2010 at 02:33:46PM +0100, Borja Martín wrote:
> I have these documents :
> 
> { "created_at": "20100301", "tag": "foo" },
> { "created_at": "20100301", "tag": "bar" },
> { "created_at": "20100301", "tag": "foo-bar" },
> { "created_at": "20100302", "tag": "foo" }
> 
> and what I want is to retrieve the documents within a date range and count
> how many times does each tag appear globally, not just by its date. I should
> get something like this:
> { "foo" : 2, "bar" : 1, "foo-bar" : 1}

If the number of distinct tags in your database is small (say < 100), then
you can use a reduce function to build a map of {tag:count} explicitly. Then
a grouped query across any range of dates will give you the map you are
looking for.
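
A minimal sketch of such a view, assuming the map function emits the date
as the key and the tag as the value (the function names here are only for
illustration). The reduce has to cope with the rereduce case, where the
incoming values are already {tag:count} objects from earlier reduce calls:

```javascript
// Map: one row per document, keyed on the date, with the tag as the value.
// `emit` is supplied by CouchDB's view server when the view is built.
function map(doc) {
  if (doc.created_at && doc.tag) {
    emit(doc.created_at, doc.tag);
  }
}

// Reduce: build a {tag: count} object.
function reduce(keys, values, rereduce) {
  var counts = {};
  for (var i = 0; i < values.length; i++) {
    var v = values[i];
    if (rereduce) {
      // v is a {tag: count} object produced by an earlier reduce call
      for (var tag in v) {
        if (Object.prototype.hasOwnProperty.call(v, tag)) {
          counts[tag] = (counts[tag] || 0) + v[tag];
        }
      }
    } else {
      // v is a single tag string emitted by the map function
      counts[v] = (counts[v] || 0) + 1;
    }
  }
  return counts;
}
```

The reduce value grows with the number of distinct tags, which is why this
only suits a small tag set.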

Otherwise, I suggest you group by [date,tag] as before and do the summation
on the client side. That is, with a _count reduce function and
startkey=["20100301"]&endkey=["20100302",{}]&group=true
you should get
  "key":["20100301","foo"], "value":1
  "key":["20100301","bar"], "value":1
  "key":["20100301","foo-bar"], "value":1
  "key":["20100302","foo"], "value":1
and you can add the counts from the [x,tag] rows yourself. If you want to do
this server-side you can use a _list function to do the accumulation.
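
The client-side summation could look roughly like this. It is only a sketch:
sumByTag is a made-up name, and it assumes you have already parsed the JSON
response and are passing in its "rows" array.

```javascript
// Add up the per-[date, tag] counts returned by the grouped query,
// ignoring the date part of each key.
function sumByTag(rows) {
  var totals = {};
  for (var i = 0; i < rows.length; i++) {
    var tag = rows[i].key[1];          // key is [date, tag]
    totals[tag] = (totals[tag] || 0) + rows[i].value;
  }
  return totals;
}

// The four rows from the example above
var rows = [
  { key: ["20100301", "foo"],     value: 1 },
  { key: ["20100301", "bar"],     value: 1 },
  { key: ["20100301", "foo-bar"], value: 1 },
  { key: ["20100302", "foo"],     value: 1 }
];

console.log(JSON.stringify(sumByTag(rows)));
// prints {"foo":2,"bar":1,"foo-bar":1}
```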

Depending on how large your date ranges are, you can make more complex
solutions using larger buckets.  For example, have another view which emits
["201003","foo"] as a key, to allow you to sum all the tags in March 2010. 
So searching from 6 April 2009 to 5 April 2010 might require three queries
(one each for the partial months at each end, and one for the whole months
in between).
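
As a rough illustration of how a client might split a range into those
queries: the sketch below assumes dates are YYYYMMDD strings and that there
are two views, which I have called "by_day" (keyed on [date, tag]) and
"by_month" (keyed on [month, tag]). Both names, and splitRange itself, are
hypothetical.

```javascript
function pad2(n) { return (n < 10 ? "0" : "") + n; }

// Number of days in a month; month is 1-12
function daysInMonth(year, month) {
  return new Date(Date.UTC(year, month, 0)).getUTCDate();
}

// Turn a month index (year * 12 + zero-based month) into "YYYYMM"
function monthKey(idx) {
  return String(Math.floor(idx / 12)) + pad2(idx % 12 + 1);
}

// Split an inclusive date range into at most three view queries:
// a leading partial month, the whole months, and a trailing partial month.
function splitRange(start, end) {
  var sy = +start.slice(0, 4), sm = +start.slice(4, 6), sd = +start.slice(6, 8);
  var ey = +end.slice(0, 4), em = +end.slice(4, 6), ed = +end.slice(6, 8);
  var startsOnFirst = (sd === 1);
  var endsOnLast = (ed === daysInMonth(ey, em));
  var firstWhole = sy * 12 + (sm - 1) + (startsOnFirst ? 0 : 1);
  var lastWhole = ey * 12 + (em - 1) - (endsOnLast ? 0 : 1);

  if (firstWhole > lastWhole) {
    // No complete month inside the range: a single daily query covers it
    return [{ view: "by_day", startkey: [start], endkey: [end, {}] }];
  }

  var queries = [];
  if (!startsOnFirst) {
    var lastDay = start.slice(0, 6) + pad2(daysInMonth(sy, sm));
    queries.push({ view: "by_day", startkey: [start], endkey: [lastDay, {}] });
  }
  queries.push({
    view: "by_month",
    startkey: [monthKey(firstWhole)],
    endkey: [monthKey(lastWhole), {}]
  });
  if (!endsOnLast) {
    queries.push({ view: "by_day", startkey: [end.slice(0, 6) + "01"], endkey: [end, {}] });
  }
  return queries;
}
```

For 6 April 2009 to 5 April 2010, splitRange("20090406", "20100405") gives a
by_day query for 20090406..20090430, a by_month query for 200905..201003,
and a by_day query for 20100401..20100405. The per-tag counts from the three
results still need to be added together afterwards.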

HTH,

Brian.
