couchdb-user mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Create a view with only unique records
Date Mon, 14 Apr 2008 19:48:15 GMT

On Apr 14, 2008, at 17:19, Jan Lehnardt wrote:
>
> On Apr 14, 2008, at 02:34, Ralf Nieuwenhuijsen wrote:
>> Well, that doesn't really apply. I am not looking for a way to create
>> unique documents.
>> I'm looking for a way to get a view with only unique documents.
>>
>> Imagine some portion of all the documents having the key 'adres'.
>> Then I want a list of unique addresses: a view with only the adres keys
>> for documents that have it, and then only unique entries.
>>
>> It seems I can currently solve this problem in two ways:
>> - creating a separate adres document that stores an array of all unique
>>   addresses. But without any sane default merging behavior, this breaks
>>   at replication.
>> - creating a separate document for _each_ adres, using PUT and the md5
>>   of the adres as doc-id. This seems like an enormous waste of space,
>>   especially since I will be doing this with almost every key in every
>>   document.
>>
>> In the future this should be doable with the reduce/combinator behavior,
>> I expect. But even there, I think the suggested approach is too limiting.
>> The reducer is going to return one JSON object. I would rather have it
>> emit (key, value) and use default view operations on it for stuff like
>> pagination.
>>
>> Using the above example and assuming the reducer is implemented: how do
>> you get the X most used addresses? The value of X needs to be hard-coded
>> with the suggested implementation, whereas using emit(key, value) in the
>> reducer as well would allow for pagination.
>
> I might be totally off here, but the reduce function actually does
> only return one key-value pair per key for the view:
>
> map: /* _id = md5(address) */
> function(doc) {
>  emit(doc._id, 1);
> }
>
> produces:
>
> abc | 1
> abc | 1
> def | 1
> xyz | 1
> yyy | 1
> yyy | 1
> yyy | 1
>
> for fictional _id values.
>
> reduce:
> function(keys, values) {
>   // Sum the 1s emitted by map to count the rows for each key.
>   var sum = 0;
>   for (var i = 0; i < values.length; i++) {
>     sum += values[i];
>   }
>   return sum;
> }
>
> produces:
>
> abc | 2
> def | 1
> xyz | 1
> yyy | 3
>
> as the output of the view, which can be paginated just as easily as
> the list that map alone produces. This gives you a count for all
> addresses but not yet a list sorted by count. Got to think about that
> one a bit more.

I checked back with Damien and we can't do that sorting in the view now.
You'd need to collate the reduce result in your application, or use
Lucene or some other technology to do that for you.
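
To illustrate the application-side collation, here is a minimal sketch. It
assumes the grouped reduce output has already been fetched into an array of
{key, value} objects; the sample rows and the topAddresses helper are
hypothetical and not part of CouchDB.

```javascript
// Hypothetical rows, shaped like the reduce output quoted above.
const rows = [
  { key: "abc", value: 2 },
  { key: "def", value: 1 },
  { key: "xyz", value: 1 },
  { key: "yyy", value: 3 },
];

// topAddresses is an illustrative helper, not a CouchDB API.
// It returns the x rows with the highest counts, highest first.
function topAddresses(rows, x) {
  return rows
    .slice()                           // copy so the input stays untouched
    .sort((a, b) => b.value - a.value) // descending by count
    .slice(0, x);                      // keep only the first x rows
}

console.log(topAddresses(rows, 2));
```

Because X is an ordinary argument here, the application can choose it per
request instead of hard-coding it in the view.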

Cheers
Jan