couchdb-user mailing list archives

From Robert Newson <robert.new...@gmail.com>
Subject Re: Map/Reduce Question
Date Thu, 02 Dec 2010 11:46:37 GMT
The simplest way to dedupe this is:

function(keys, values, rereduce) {
  return values[0];
}

This assumes that all values for the same key are identical, but I
think that's what you're saying.
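
For illustration, here is a rough sketch of how that reduce behaves when the view is queried with ?group=true, which is what makes CouchDB run the reduce once per distinct key (without grouping, the whole view reduces to a single row). The groupReduce helper and the sample rows are hypothetical, written only to imitate grouping in plain JavaScript; they are not part of CouchDB's API.

```javascript
// The reduce function from the message above, unchanged.
function reduce(keys, values, rereduce) {
  return values[0];
}

// Hypothetical helper (not a CouchDB API): imitates a ?group=true query
// by calling the reduce once for each distinct key, in first-seen order.
function groupReduce(rows, reduceFn) {
  var groups = {};
  var order = [];
  rows.forEach(function (row) {
    if (!Object.prototype.hasOwnProperty.call(groups, row.key)) {
      groups[row.key] = [];
      order.push(row.key);
    }
    groups[row.key].push(row);
  });
  return order.map(function (key) {
    var group = groups[key];
    // On a first (non-rereduce) pass, keys arrive as [key, docid] pairs.
    var keys = group.map(function (r) { return [r.key, r.id]; });
    var values = group.map(function (r) { return r.value; });
    return { key: key, value: reduceFn(keys, values, false) };
  });
}

// Sample rows modelled on the map output quoted below (assumed data).
var rows = [
  { id: "guid1", key: "key1", value: ["value1", "value2"] },
  { id: "guid2", key: "key1", value: ["value1", "value2"] },
  { id: "guid3", key: "key2", value: ["value1", "value2"] },
  { id: "guid4", key: "key2", value: ["value1", "value2"] }
];

var result = groupReduce(rows, reduce);
console.log(JSON.stringify(result));
```

Two things to keep in mind: reduced rows carry only a key and a value, so the "id" field from the map output will not appear in the grouped result; and because the function ignores the rereduce flag, it relies on every value for a given key being identical, on both the first pass and any rereduce pass.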

B.

On Thu, Dec 2, 2010 at 12:43 AM, Matthew Woodward <matt@mattwoodward.com> wrote:
> I'm catching on to the map bit of map/reduce decently, but now that I need
> to reduce something I'm having some issues, so I'm hoping someone can steer
> me in the right direction.
>
> I have a view that outputs a key, and then an array as the value using the
> following map function:
> {
>    "viewname": {
>        "map":"function(doc) {
>            var value;
>            if (doc.foo != '' && (doc.bar != '' || doc.baz != '')) {
>                value = [doc.bar, doc.baz];
>                emit(doc.foo, value);
>            }
>        }"
>    }
> }
>
> To explain that a bit--basically in my documents foo (which is my key for
> this map function) might be a zero-length string, in which case I don't want
> to output that document's information. Additionally if bar and baz are both
> zero-length strings I don't want the document included in that case either,
> but if either bar or baz has a value then I want to include it. And then my
> value is an array of bar and baz. This is all working great.
>
> The issue is that I have numerous duplicates in my output, i.e. where foo,
> bar, and baz have the same values as another document. This is to be
> expected in the documents themselves so there's no issue with the data.
>
> For the purposes of this view, however, I only want to output unique results
> for each value of foo (my key).
>
> To use a concrete example, let's say currently using the map function above
> I'm getting this output:
> {"total_rows":3800,"offset":0,"rows":[
> {"id":"guid1","key":"key1","value":["value1", "value2"]},
> {"id":"guid2","key":"key1","value":["value1", "value2"]},
> {"id":"guid3","key":"key2","value":["value1", "value2"]},
> {"id":"guid4","key":"key2","value":["value1", "value2"]},
> ... etc. ...
> ]}
>
> What I need to wind up with is this:
> {"total_rows":3800,"offset":0,"rows":[
> {"id":"guid1","key":"key1","value":["value1", "value2"]},
> {"id":"guid3","key":"key2","value":["value1", "value2"]},
> ... etc. ...
> ]}
>
> In other words, if the key and value are identical across records I want to
> only output one result, but if the value is the same as another document
> *but the key is different*, then I do want to include it in the output. Hope
> I'm explaining that clearly.
>
> Happy to clarify further, and really appreciate any suggestions anyone has.
>
> Thanks!
> --
> Matthew Woodward
> matt@mattwoodward.com
> http://blog.mattwoodward.com
> identi.ca / Twitter: @mpwoodward
>
> Please do not send me proprietary file formats such as Word, PowerPoint,
> etc. as attachments.
> http://www.gnu.org/philosophy/no-word-attachments.html
>
