couchdb-user mailing list archives

From Nicolas Clairon <clai...@gmail.com>
Subject Re: group_level and sorting
Date Mon, 20 Apr 2009 07:51:29 GMT
First, thank you Jason and Wout for clarifying the use of reduce
for me. Its behaviour is not
trivial for me to understand.

I think the time has come to explain more about my work. I'll do the
best I can with my poor written English.

= Creation of documents =


Say there are 3 users in the application. Each user can write the same
document A (by same, I mean same
structure). Here's the structure of DocumentA:

DocumentA = {
  _id: ...,
  _rev: ...,
  type: "documentA",
  title: ...,
  description: ...,
  tags: ..., // a list of tags
  date_creation: ...
}

Now the users create some DocumentA documents:

user1 => {
  "title": "title1",
  "description":"a document A",
  "tags": [],
  "date_creation": 1
}
user2 => {
  "title": "title1",
  "description": "another description",
  "tags": ["tag1"],
  "date_creation": 2
}
user3 => {
  "title": "title1",
  "description": "a document A",
  "tags":["tag1","tag2"],
  "date_creation": 3
}


Note that user1, user2 and user3 create the same document A (same
title). They can create more DocumentA documents,
like:

{
  "title": "title2",
  "description": null,
  "tags": ["tag3"],
  "date_creation": 2
}

but we don't care about the others. We are only concerned with the
documents A that have title1.


= Reducing to get summary =

So, we have 3 documents A (with "title1" as title) but I want to
display only one to my users.
The displayed document will show the most relevant information (the
most used description) and
all the tags. So I created a reduce function which summarizes the 3
documents into a single one:

reduced_documentA = {
  "title":"title1",
  "description":"a document A",
  "tags":["tag1","tag2"],
  "date_creation": 1
}

Note that the reduced document took the earliest date_creation.
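
A minimal sketch of how such a summary could be computed so that reduce
and re-reduce share one code path. This is an assumption, not the code
actually used in my project: the intermediate value shape (a
`descriptions` count table plus `tags` and `date_creation`) and the names
`toValue`, `summarize` and `finalize` are hypothetical. The map would
emit `toValue(doc)` under the key `doc.title`.

```javascript
// Hypothetical value emitted by the map for each DocumentA (key: doc.title).
function toValue(doc) {
  var descriptions = {};
  if (doc.description !== null && doc.description !== undefined) {
    descriptions[doc.description] = 1;
  }
  return {
    descriptions: descriptions,        // description -> number of uses
    tags: (doc.tags || []).slice(),
    date_creation: doc.date_creation
  };
}

// Reduce function: map values and earlier reduce results have the same
// shape, so the rereduce flag does not change the logic.
function summarize(keys, values, rereduce) {
  var result = { descriptions: {}, tags: [], date_creation: null };
  values.forEach(function (v) {
    // Add up how often each description was used.
    Object.keys(v.descriptions).forEach(function (d) {
      result.descriptions[d] = (result.descriptions[d] || 0) + v.descriptions[d];
    });
    // Union of tags, without duplicates.
    v.tags.forEach(function (t) {
      if (result.tags.indexOf(t) < 0) { result.tags.push(t); }
    });
    // Keep the earliest creation date seen so far.
    if (v.date_creation !== null &&
        (result.date_creation === null || v.date_creation < result.date_creation)) {
      result.date_creation = v.date_creation;
    }
  });
  result.tags.sort();
  return result;
}

// Client-side step: pick the most used description from the count table.
function finalize(title, summary) {
  var best = null;
  Object.keys(summary.descriptions).forEach(function (d) {
    if (best === null || summary.descriptions[d] > summary.descriptions[best]) {
      best = d;
    }
  });
  return {
    title: title,
    description: best,
    tags: summary.tags,
    date_creation: summary.date_creation
  };
}
```

Keeping the description counts in the intermediate value is what makes
the re-reduce step safe: picking the winning description too early would
lose the counts needed when partial results are merged later.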

= Displaying the reduced document =

The challenge now is to display the reduced documents by tag, then by
date_creation.

That's all.

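How can I do that? If I use a map function:

  emit([doc.tags[t], doc.date_creation], doc.title);

I will have duplicates, because each source document emits its own row
for every tag it carries (title1 would appear twice under tag1).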

I thought that I could use an external process for that. Each time the
reduced_document
is updated, the external process will store the reduced_document as a
regular document in
another db (so that I can query it more simply).
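
If the summaries are stored as regular documents in a second database,
the query side could be a plain map-only view. The sketch below assumes
each stored summary has the same shape as reduced_documentA;
`byTagThenDate`, `sortRows` and the local `emit` stub are hypothetical
names, there only so the snippet can run outside CouchDB.

```javascript
// Stand-in for CouchDB's emit(), so the map function can run under Node.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function for the summary database: one row per tag of each summary.
function byTagThenDate(doc) {
  (doc.tags || []).forEach(function (tag) {
    emit([tag, doc.date_creation], doc.title);
  });
}

// Rough approximation of view ordering for these simple keys:
// first by tag, then by date_creation.
function sortRows(rs) {
  return rs.slice().sort(function (a, b) {
    if (a.key[0] !== b.key[0]) { return a.key[0] < b.key[0] ? -1 : 1; }
    return a.key[1] - b.key[1];
  });
}
```

Since each title exists only once in that database, querying such a view
with startkey=["tag1"] and endkey=["tag1",{}] should return every title
once under tag1, already ordered by date_creation, with no reduce and no
group_level involved.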

What do you think? Is it the way to go?

Thank you for reading this far.

Nicolas



On Sun, Apr 19, 2009 at 12:02 PM, Wout Mertens <wout.mertens@gmail.com> wrote:
> It's not quite clear exactly what you want. I see that you are making your
> value list unique, so I'm assuming that you want some sort of unique or last
> function?
>
>
> I notice a mistake in your code: when creating a reduce function, a
> re-reduce should behave in the same way as the regular reduce. The reason is
> that CouchDB doesn't necessarily call re-reduce on your map results.
>
> Think about it this way: If you have a bunch of values V1 V2 V3 for key K,
> then you can get the combined result either by calling
> reduce([K,K,K],[V1,V2,V3],0) or by re-reducing the individual results:
> reduce(null,[R1,R2,R3],1). This depends on what your view results look like
> internally.
>
>
> That said, what are you interested in? Do you want the last-changed title or
> do you want all titles in order of appearance?
>
> If the latter, you don't need a reduce at all, you just look at the results
> of your map function in order.
>
> If the former, you'll have to put your date in the value of the map, and in
> the reduce you'll have to decide which title you want to keep for that set
> of values.
>
> Wout.
>
> PS: I changed the wiki so that it has a common mistakes section in the
> ViewSnippets page and I added the reduce mistake.
>
> On Apr 18, 2009, at 6:52 PM, Nicolas Clairon wrote:
>
>> Hi all !
>>
>> I have a question (which is a big concern to me and my project) about
>> the group_level option.
>>
>> I want to display all docs by tag and then sort them by date.
>>
>> The map function :
>> -----------------------------------
>> function(doc){
>>  for(var t in doc.tags){
>>   emit([doc.tags[t], doc.creation_date], doc.title);
>>  }
>> }
>> ------------------------------------
>>
>> * creation_date is a float since the epoch (ie something like this
>> 12423344.003)
>> * docs can have the same title
>>
>> the reduce function:
>> -----------------------------------
>> function(key, values, rereduce){
>>   var results = [];
>>   if(!rereduce){
>>       for( var v in values){
>>           if (results.indexOf( values[v] ) < 0){results.push(values[v]);}
>>       }
>>   }
>>   else{
>>       for( var i in values){
>>           for(var e in values[i]){
>>               results.push(values[i][e]);
>>           }
>>       }
>>   }
>>   return results;
>> }
>> -----------------------------------
>>
>> $ curl
>> http://localhost:5984/db/_design/foo/_view/by_tag_sort_by_date?reduce=false
>>
>> returns :
>>
>>
>> {"id":"8075ba2ef7418f4d6c9a3e89be83acd8","key":["tag1",1239361935.000004],"value":"title2"},
>>
>> {"id":"8d9132318a6c34c646e9e2cd43823ffa","key":["tag1",1239794744.000002],"value":"title1"},
>>
>> {"id":"f49a28ffd2118298c1be7440ec4556fa","key":["tag2",1239794744.000002],"value":"title1"},
>>
>> this is ok because title1 is newer than title2. But now, I want all
>> displayed by tag so I use the group_level :
>>
>> $ curl
>> http://localhost:5984/db/_design/foo/_view/by_tag_sort_by_date?group_level=1
>>
>> {"key":["tag1"],"value":["title1","title2"]},
>> {"key":["tag2"],"value":["title1"]},
>>
>> I have all titles by tag but the docs are not sorted by date anymore...
>>
>> Does group_level keep the absolute sorting? Does the sort break anyway?
>>
>> Thanks,
>>
>> Nicolas
>
>
