incubator-couchdb-user mailing list archives

From J Chris Anderson <jch...@gmail.com>
Subject Re: Is there anyway to specify a group_level of "id"?
Date Mon, 19 Apr 2010 16:50:57 GMT

On Apr 19, 2010, at 9:27 AM, Jarrod Roberson wrote:

> On Mon, Apr 19, 2010 at 10:10 AM, Adam Kocoloski <kocolosk@apache.org> wrote:
>> On Apr 18, 2010, at 1:37 AM, Jarrod Roberson wrote:
>> 
>> 
>> Hi Jarrod, I'd need a little more detail or an example before I could say whether what
>> you want to do is possible.  Best,
>> 
>> Adam
> 
> I am working on what I think is a clever solution to not being able to
> do variable, SQL-like "select where" selections on CouchDB.
> 
> here is my map function
> 
> function(doc)
> {
>  emit(['cnnid', doc.cnnid], null);
>  emit(['guid', doc.guid], null);
>  emit(['src', doc.sourceServer], null);
>  emit(['dest', doc.destServer], null);
> }
> 
> I am running a reduce with group_level=1 that merges lists of _ids by
> field. The result is a unique list of _ids matching each field name
> when run with keys=[["cnnid","11111111"],["src","a"]]. I get output
> grouped by cnnid and src. What I want to do is rereduce just the
> "final" output one more time, to reduce the keys down to the unique
> list of _ids from the resulting groups.
> 

I think this is a misuse of CouchDB's reduce, which is really just there to provide numerical
aggregation (or other constant-space operations). I'm surprised you haven't been hit with
the reduce_overflow_error yet. As your database becomes larger, this will surely happen.

If you want to do something like this, you are better off moving all your uniqueness logic
to a _list function, and then optimizing your map collation to keep the memory usage of your
_list low.
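
As a rough illustration of the _list approach, here is a minimal sketch. It assumes the view is queried without the reduce (reduce=false), so that each row carries its document id in row.id, and that "unique" means the de-duplicated union of ids across all requested keys. The names collectUniqueIds and listUniqueIds are hypothetical. getRow() and send() are provided by CouchDB's query server; JSON.stringify is assumed to be available there, and older releases may need the query server's own JSON helper instead.

```javascript
// Hypothetical helper: pull rows from nextRow() until it returns a falsy
// value, and return the distinct document ids in first-seen order.
function collectUniqueIds(nextRow) {
  var seen = {};   // ids already emitted; grows with the number of distinct ids
  var ids = [];
  var row;
  while ((row = nextRow())) {
    if (!Object.prototype.hasOwnProperty.call(seen, row.id)) {
      seen[row.id] = true;
      ids.push(row.id);
    }
  }
  return ids;
}

// Sketch of the _list function. getRow and send are globals supplied by
// CouchDB when the function runs inside the query server.
function listUniqueIds(head, req) {
  send(JSON.stringify({ ids: collectUniqueIds(getRow) }));
}
```

In a design document the list function is stored as a single anonymous function, so the helper would have to be inlined into its body. The seen table is the memory cost of this approach, which is why the key ordering chosen in the map matters.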

Chris

> 
> curl -X POST -d '{"keys":[["cnnid","82534864"],["src","a"]]}' \
> 'http://localhost:5984/transfer_central/_design/transfer/_view/search?group=true&group_level=1'
> The output looks like this:
> 
> {"rows":[
> {"key":["cnnid","82534864"],"value":["fdbc746e0026B93BD6FE6f83c80de090","fdbc6f930026B92F3075118c8e46f574","fdbc59760026B92F30754e88a5fb1d0a"]},
> {"key":["src","a"],"value":["2fe5b7620026B93BD6FE54240135cf78","3028f1010026B93BD6FE27430d5ff179","3028f3df0026B93BD6FE1aaf792acaec","48a5d7ab0026B93BD6FE347e40beba22","48a5dada0026B93BD6FE5759f3f61946","48a8630c0026B93BD6FE6bc36f72aaf9","673fd21a0026B93BD6FE56b473da6a77","673fd4790026B93BD6FE47aeffcaa16b","67af7dbc0026B93BD6FE3134aabda4b6","67af80b60026B93BD6FE132454bf17b5"]}
> ]}
> 
> I tried group_level=0 but that does the same thing as 1 in my case. I
> tried hacking at the Erlang source to get it to run another final
> rereduce with a "special" group_level=9999 but I didn't have any luck
> getting that to work. What I finally resorted to is a List function
> that does the final reduce/merge of the _ids (same thing the reduce
> function is doing really) but I really think it would be clever if I
> could get the reduce to run one last rereduce instead of the List
> function solution.
> 
> Here is a post about it in more detail
> http://www.vertigrated.com/blog/2010/04/where-clauses-like-selects-against-couchdb/

