incubator-couchdb-user mailing list archives

From Torstein Krause Johansen <torsteinkrausew...@gmail.com>
Subject Complex queries & results
Date Thu, 26 May 2011 07:12:14 GMT
Hi all,

I am having trouble solving the following problem with CouchDB and am 
wondering whether I'm trying to do something for which Couch isn't 
suitable, whether I have misunderstood something, or whether there's some 
hidden feature I haven't discovered yet.

I have documents with the following fields:
{
   one_id : 1,
   another_id : 22,
   created_at : "2011-05-26",
   a_name : "Lisa"
}

I want to search for all documents matching a combination of the first 
three fields as query parameters and then count the occurrences of each 
a_name value within each of these result sets. For this reason, I put this 
into my view/map.js: emit([doc.one_id, doc.another_id, doc.created_at], doc.a_name);
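
For concreteness, the map function might look roughly like the sketch below. The emit() stub and the emitted array are only there so the snippet runs outside CouchDB, and the guard for missing fields is an assumption rather than something from my actual view. In the design document itself the function would be anonymous.

```javascript
// Stand-in for CouchDB's emit(), so the sketch runs outside the view server.
// (Assumption for illustration only; CouchDB provides emit() itself.)
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// The map function, named here only so it can be called below.
function map(doc) {
  // Assumed guard: skip documents lacking any field used in the key.
  if (doc.one_id === undefined ||
      doc.another_id === undefined ||
      doc.created_at === undefined) {
    return;
  }
  // Compound key of the three query fields; the name is the value.
  emit([doc.one_id, doc.another_id, doc.created_at], doc.a_name);
}

// Run it against the example document.
map({ one_id: 1, another_id: 22, created_at: "2011-05-26", a_name: "Lisa" });
```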

Now, querying the view with these keys as startkey and endkey, I get the 
result rows I want. So far so good.
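
A range query of this kind can be sketched as below. The host, database name (mydb), design document (stats), view name (by_ids_and_date) and the date range are all hypothetical placeholders; the only thing taken from CouchDB itself is that startkey and endkey are passed as JSON-encoded query parameters.

```javascript
// Build a view URL with JSON-encoded startkey/endkey parameters.
function viewUrl(base, startKey, endKey) {
  const params = [
    "startkey=" + encodeURIComponent(JSON.stringify(startKey)),
    "endkey=" + encodeURIComponent(JSON.stringify(endKey)),
  ];
  return base + "?" + params.join("&");
}

// Hypothetical location of the view; adjust to the real database and design doc.
const base = "http://localhost:5984/mydb/_design/stats/_view/by_ids_and_date";

// All rows for one_id = 1, another_id = 22, created in May 2011 (example range).
const url = viewUrl(base, [1, 22, "2011-05-01"], [1, 22, "2011-05-31"]);
console.log(url);
```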

My next step is to count the occurrences of each a_name within each 
of these result sets, producing a dictionary like:
{
   "John" : 234142,
   "Dominique" : 21177,
   "Lisa" : 123
}
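
To make the target concrete, here is the tally written as plain JavaScript over an array of view rows. This is only an illustration of the desired output computed outside CouchDB, not a reduce function; countNames and the sample rows are made up for the example.

```javascript
// Count how often each name appears as the value of a view row.
function countNames(rows) {
  const counts = {};
  for (const row of rows) {
    const name = row.value;
    // Own-property check so names like "constructor" are counted correctly.
    if (Object.prototype.hasOwnProperty.call(counts, name)) {
      counts[name] += 1;
    } else {
      counts[name] = 1;
    }
  }
  return counts;
}

// Made-up sample rows in the shape a view query returns them.
const rows = [
  { key: [1, 22, "2011-05-26"], value: "Lisa" },
  { key: [1, 22, "2011-05-26"], value: "John" },
  { key: [1, 22, "2011-05-27"], value: "John" },
];

const tally = countNames(rows);
console.log(tally);
```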

Initially, I tried to do this with a reduce.js, but couldn't work out 
how to go about it. The documentation I've read on reduces only 
mentions simple (built-in) functions for counting rows and summing 
values, whereas what I want here are counts keyed by the values 
themselves in the view's result.

I've managed to get this working using (exploiting?) lists, but this doesn't 
scale well with hundreds of thousands of rows.

For these reasons, I've resorted to doing two view operations, one to 
get the initial results and one to get the count of each a_name within 
the first result. This works, but doesn't feel optimal. Also, the 
returned dataset of the first search is overwhelming, leading to a ~5-7 
second download of the data (and putting nginx/gzip in front of Couch 
didn't improve matters enough :-)

The total time it takes to do my two queries adds up to ~6-9 seconds, 
which is not fast enough for my application, so I am 
seeking your guidance.

Cheers,

-Torstein