couchdb-user mailing list archives

From Robert Newson <robert.new...@gmail.com>
Subject Re: Struggling with a particular Map / Reduce
Date Tue, 17 Aug 2010 10:37:30 GMT
If you emit([doc.docAuthor, doc.titles[title]], 1) instead, you could
use the built-in Erlang reduce function "_sum", which is faster than a
JavaScript reduce.
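
A minimal sketch of that change, runnable outside CouchDB. The emit stub, the rows array and the groupSum helper are stand-ins invented here to imitate what the view engine and "_sum" (queried with group=true) would do; in a real design document only the map function would appear, with "_sum" as the reduce.

```javascript
// Stand-in for CouchDB's emit(); collects rows so the map can run locally.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: one row per title occurrence, with value 1 so "_sum" can count.
function map(doc) {
  for (var i = 0; i < doc.titles.length; i++) {
    emit([doc.docAuthor, doc.titles[i]], 1);
  }
}

// Hypothetical helper imitating "_sum" with group=true:
// adds up the values of all rows that share the same key.
function groupSum(rows) {
  var totals = {};
  rows.forEach(function (row) {
    var k = JSON.stringify(row.key);
    totals[k] = (totals[k] || 0) + row.value;
  });
  return totals;
}

// Two made-up documents by the same author.
map({ docAuthor: "ian", titles: ["a", "b", "a"] });
map({ docAuthor: "ian", titles: ["a", "c"] });
var totals = groupSum(rows);
```

Because every emitted value is a number, the same summing step is valid for both the reduce and rereduce phases, which is why no custom reduce is needed.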

B.

On Tue, Aug 17, 2010 at 10:24 AM, Martin Higham <martin@ocasta.co.uk> wrote:
> I think it would be better to use the View to split the titles and create a
> list of Authors and Titles. A Map function such as
>
> function(doc) {
>  for (var title in doc.titles)
>      emit([doc.docAuthor, doc.titles[title]], null);
> }
>
> does just this.
>
> You now have a list of keys in the form [Author, title] and they are sorted
> by Author.
>
> It's easy to then take these and produce a list of unique Author/title
> combinations and a count of their frequency with the Reduce function.
>
> function(keys, values, rereduce) {
>  if (rereduce) {
>    return sum(values);
>  }
>  else {
>    return values.length;
>  }
> }
>
> However it is difficult for reduce to produce a list of the top 3. Any
> processing within the Reduce can only operate on the data passed in. It
> doesn't know what data is yet to come. If you were to output only the top 3
> entries passed in to a given invocation of the Reduce you would produce
> inaccurate results as you would potentially throw away rows that might yet
> accumulate into the all time top 3.
>
> Once you have a list of unique Author/title pairs and their frequency you
> can either sort and filter them within the client or within a list function.
>
> Hope this helps
>
> Martin
>
>
> On 17 August 2010 09:26, Ian Wootten <i.wootten@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> I was hoping somebody might be able to solve a problem I'm having
>> attempting to implement a view at the moment.
>>
>> Essentially, what it does is to take a collection of documents which
>> each have a single author and a list of names (which are possibly
>> repeated). There may be multiple documents by the same author, with
>> the same names within. Here's an example doc.
>>
>> doc.author
>> doc.titles = ['sometitle', 'someothertitle', 'sometitle', 'anothertitle']
>>
>> I would like to return a list of the top 3 titles for each
>> author across all documents. I have tried and failed for several days
>> to get this working correctly.
>>
>> So far, my map is as follows, giving the unique titles for a document,
>> not ordered at all:
>>
>> function(doc) {
>>
>>  var unique_titles = [];
>>
>>  for(var i in doc.titles)
>>  {
>>     var count=0;
>>
>>       for(var j in unique_titles)
>>       {
>>         if(doc.titles[i]==unique_titles[j])
>>         {
>>            count++;
>>         }
>>       }
>>
>>       if(count==0)
>>       {
>>         unique_titles.push(doc.titles[i]);
>>       }
>>  }
>>
>>  for(var k=0; k<unique_titles.length;k++)
>>  {
>>    emit(doc.author, unique_titles[k]);
>>  }
>> }
>>
>> My reduce is as follows; this returns two unique titles from a single
>> document when only a single document exists for an author(I think):
>>
>> function(keys, values, rereduce) {
>>  return values.splice(0,2);
>> }
>>
>> My problem is that:
>>
>> a) I can't return more than 2 items from the values array (if I set
>> the splice length to 3, futon spits back a non-reducing error at me).
>> b) Where multiple documents exist for the same author, in some
>> instances I see a weird multi-dimensional array returned (rather than
>> just two values). For example:
>> [['sometitle','someothertitle'],['anothertitle'],['afurthertitle']]
>>
>> Presumably b) is the result of multiple documents for a single author
>> interfering with one another, though I'm confused as to how I
>> configure my map/reduce in order to get the information I'm after (I
>> also wonder if it's even possible).
>>
>> I've attempted to understand the documentation on reduce functions,
>> taking a look at the many examples that exist too, but I'm unable to
>> understand them well enough to solve my problem.
>>
>> I'd appreciate any help on this!
>>
>> Thanks,
>>
>> Ian
>>
>
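Martin's closing suggestion, sorting and filtering the grouped rows in the client, could be sketched roughly as follows. This assumes the view is queried with group=true so that each row has the form { key: [author, title], value: count }; the sample rows and the topTitles helper are invented for illustration and are not part of CouchDB.

```javascript
// Rows shaped like a group=true view response (made-up sample data).
var groupedRows = [
  { key: ["ian", "a"], value: 5 },
  { key: ["ian", "b"], value: 2 },
  { key: ["ian", "c"], value: 4 },
  { key: ["ian", "d"], value: 1 },
  { key: ["martin", "x"], value: 1 }
];

// Hypothetical client-side helper: the n most frequent titles per author.
function topTitles(rows, n) {
  var byAuthor = {};
  // Bucket each [author, title] row under its author.
  rows.forEach(function (row) {
    var author = row.key[0];
    if (!byAuthor[author]) {
      byAuthor[author] = [];
    }
    byAuthor[author].push({ title: row.key[1], count: row.value });
  });
  // Sort each bucket by descending count and keep the first n entries.
  Object.keys(byAuthor).forEach(function (author) {
    byAuthor[author].sort(function (x, y) { return y.count - x.count; });
    byAuthor[author] = byAuthor[author].slice(0, n);
  });
  return byAuthor;
}

var top = topTitles(groupedRows, 3);
```

The truncation to three happens only after all counts are final, which avoids the problem Martin describes of discarding rows inside the reduce before their totals have accumulated.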
