incubator-couchdb-user mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: [user] Obtaining unique values from a view
Date Tue, 03 Mar 2009 17:00:15 GMT
On Tue, Mar 3, 2009 at 6:52 AM, Wout Mertens <wmertens@cisco.com> wrote:
>>> Since this question has been posed more than once, maybe main.js should
>>> have a uniq() function as well?

Dragons dragons...

This is one pattern Couch does not support (and it is not unique in this way).

When I first started working with CouchDB, I really wanted to take maps like

a, 1
a, 5
a, 8
b, 2
b, 6
b, 6
b, 7

and use reduce to turn them into:

a: 1, 5, 8
b: 2, 6, 7

Don't do this!

It's just taking a tall list and making it into a wide list. The
disadvantage of a wide list is that you have to hold the whole thing
in memory at once. This is where Couch breaks down, because the
SpiderMonkey process eventually has to hold every unique row of the
map in memory at the same time.

It's fine if your map has 10-ish unique keys, but even at 100-ish
unique keys, reduces will start to time out. Remember that the lists
I showed above will turn into something like this when the final
reduction is calculated:

1,2,5,6,7,8

As you can see, this could become a very large amount of data on
real-life datasets (which would probably have thousands of values at
full reduce).
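
For concreteness, here is a minimal sketch of the kind of reduce being
warned against. It is not taken from main.js: the name uniqReduce is
made up for illustration, and the sort assumes the values are numbers.
The three calls at the bottom only simulate, outside CouchDB, how a
reduce and a final rereduce would be invoked on the lists above.

```javascript
// Anti-pattern sketch: a reduce that collects unique values into an array.
// uniqReduce is a hypothetical name; values are assumed to be numbers.
function uniqReduce(keys, values, rereduce) {
  var seen = {};
  var out = [];
  // On rereduce, every value is already an array returned by an earlier call.
  var lists = rereduce ? values : [values];
  for (var i = 0; i < lists.length; i++) {
    for (var j = 0; j < lists[i].length; j++) {
      var v = lists[i][j];
      var k = JSON.stringify(v);
      if (!Object.prototype.hasOwnProperty.call(seen, k)) {
        seen[k] = true;
        out.push(v);
      }
    }
  }
  // The returned array grows with the number of unique values.
  return out.sort(function (x, y) { return x - y; });
}

// Simulated calls for the example lists (not how CouchDB schedules them).
var a = uniqReduce(null, [1, 5, 8], false);    // a is [1, 5, 8]
var b = uniqReduce(null, [2, 6, 6, 7], false); // b is [2, 6, 7]
var all = uniqReduce(null, [a, b], true);      // all is [1, 2, 5, 6, 7, 8]
```

The last call is the point: the final reduction has to carry every
unique value at once, so its size is bounded only by the dataset.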

You can use the log(data) function in your map and reduce functions
(and watch couch.log) to see how the tall list just gets turned into
the wide list, with functions like this.
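
A rough sketch of what that logging could look like. Inside CouchDB
the view server provides log() and emit(); the stand-ins below exist
only so the snippet runs on its own. The names mapFn and reduceFn and
the document fields group and n are assumptions for illustration, and
the reduce is deliberately the naive wide-list kind.

```javascript
// Stand-ins so the sketch runs outside CouchDB. In a real view the
// view server supplies log() and emit(), and these are not needed.
var logged = [];
function log(msg) { logged.push(JSON.stringify(msg)); }
var emitted = [];
function emit(key, value) { emitted.push([key, value]); }

// Hypothetical map: assumes documents with `group` and `n` fields.
function mapFn(doc) {
  log(["map", doc._id, doc.group, doc.n]);
  emit(doc.group, doc.n);
}

// Naive wide-list reduce that logs what it receives on each call.
function reduceFn(keys, values, rereduce) {
  log({ rereduce: rereduce, count: values.length, values: values });
  // On rereduce, values is an array of arrays; flatten it by one level.
  return rereduce ? Array.prototype.concat.apply([], values) : values;
}
```

Watching the logged values arrays get longer from one call to the
next is the tall list turning into the wide one.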

Chris

-- 
Chris Anderson
http://jchris.mfdz.com
