incubator-couchdb-user mailing list archives

From "Eli Stevens (Gmail)" <wickedg...@gmail.com>
Subject Re: how to count the number of unique values
Date Sat, 16 Oct 2010 17:57:17 GMT
Assuming your 'work' docs contain a type and a list of
subject IDs (code untested, sorry):

map = function(doc) {
  if (doc.type == 'work') {
    for (var i = 0; i < doc.subject_ids.length; i++) {
      // Emitting a list containing a single doc._id keeps the reduce
      // function simpler; it's not required, though.
      emit(doc.subject_ids[i], [doc._id]);
    }
  }
}

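reduce = function (key, values, rereduce) {
  // Each value is a list of doc IDs, both on the first pass and on
  // rereduce, so concatenate them into one flat list.
  var combinedList = [];
  for (var i = 0; i < values.length; i++) {
    combinedList = combinedList.concat(values[i]);
  }
  return combinedList;
}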

Queried with group=true, this produces a view with rows like:

{key: 'subj_id1', value: ['work_id1', 'work_id2', ...]},
{key: 'subj_id2', value: ['work_id2', 'work_id3', ...]},
{key: 'subj_id3', value: ['work_id1', 'work_id4', ...]},
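
A quick way to sanity-check these functions outside CouchDB is to run them over a few made-up docs. This is only a sketch: the sample docs and the groupedView helper are hypothetical, emit is passed in as a parameter here (in CouchDB it is a global), and the reduce concatenates the per-doc lists so that each value comes out flat, like the rows above.

```javascript
// Hypothetical sample docs; only the field names (type, subject_ids)
// come from the map function in this message.
var docs = [
  { _id: 'work_id1', type: 'work', subject_ids: ['subj_id1', 'subj_id3'] },
  { _id: 'work_id2', type: 'work', subject_ids: ['subj_id1', 'subj_id2'] },
  { _id: 'book_id1', type: 'book', subject_ids: ['subj_id1'] }
];

// Same logic as the view's map function, with emit injected for local use.
function map(doc, emit) {
  if (doc.type == 'work') {
    for (var i = 0; i < doc.subject_ids.length; i++) {
      emit(doc.subject_ids[i], [doc._id]);
    }
  }
}

// Values are always lists of doc IDs, so the same code handles rereduce.
function reduce(key, values, rereduce) {
  var combinedList = [];
  for (var i = 0; i < values.length; i++) {
    combinedList = combinedList.concat(values[i]);
  }
  return combinedList;
}

// Local stand-in for querying the view with group=true:
// collect emitted values per key, then reduce each group once.
function groupedView(allDocs) {
  var byKey = {};
  allDocs.forEach(function (doc) {
    map(doc, function (key, value) {
      (byKey[key] = byKey[key] || []).push(value);
    });
  });
  return Object.keys(byKey).sort().map(function (key) {
    return { key: key, value: reduce(null, byKey[key], false) };
  });
}

var rows = groupedView(docs);
console.log(JSON.stringify(rows, null, 2));
```

The 'book' doc is skipped by the type check, so only the two work docs contribute rows.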

Does that help?

Eli

On Sat, Oct 16, 2010 at 8:04 AM, Anand Chitipothu <anandology@gmail.com> wrote:
> 2010/10/15 Wout Mertens <wout.mertens@gmail.com>:
>> Just wanted to add that if you have a map function that emits (tag, 1) for each tag
>> and then a reduce function that's just _count, you will have everything you need for painting
>> a tag cloud.
>>
>> The view with group=true will list all tags exactly once, with their count. CouchDB
>> doesn't tell you how many rows are in the result so you'll have to count them yourself.
>>
>> So you load that entire view in memory and you can draw the tags with their relative
>> sizes.
>>
>> Wout.
>
> The example I gave is rather simplified. I'm working with data
> containing 25M+ docs with books, works and subjects. I need to find
> the list/count of works for each subject. I don't think it is
> practical to load the view into memory to compute the required result.
>
> Anand
>
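
For the counting approach quoted above (emit (key, 1), reduce with the built-in _count, query with group=true), a minimal sketch might look like the following. The design document name "subjects", the view name "work_count", and the sample docs are made up for illustration, and countByKey is only a local stand-in for what CouchDB would compute on the server.

```javascript
// Rows collected by the local emit stand-in (in CouchDB, emit is provided).
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Map function: one (subject_id, 1) row per subject of each work.
var mapWorkSubjects = function (doc) {
  if (doc.type == 'work') {
    for (var i = 0; i < doc.subject_ids.length; i++) {
      emit(doc.subject_ids[i], 1);
    }
  }
};

// Hypothetical design document; "_count" is CouchDB's built-in reduce.
var designDoc = {
  _id: '_design/subjects',
  views: {
    work_count: { map: mapWorkSubjects.toString(), reduce: '_count' }
  }
};

// Local stand-in for querying the view with group=true:
// one row per distinct key, valued with the number of emitted rows.
function countByKey(docs) {
  emitted = [];
  docs.forEach(function (doc) { mapWorkSubjects(doc); });
  var counts = {};
  emitted.forEach(function (row) {
    counts[row.key] = (counts[row.key] || 0) + 1;
  });
  return Object.keys(counts).sort().map(function (key) {
    return { key: key, value: counts[key] };
  });
}

// Made-up sample docs.
var sampleDocs = [
  { _id: 'work_id1', type: 'work', subject_ids: ['subj_id1', 'subj_id3'] },
  { _id: 'work_id2', type: 'work', subject_ids: ['subj_id1', 'subj_id2'] }
];
var countRows = countByKey(sampleDocs);
console.log(JSON.stringify(countRows));
```

Since the counts are kept in the view index on the server, a client can request the row for a single subject (group=true together with a key parameter) rather than loading the whole view.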
