incubator-couchdb-user mailing list archives

From Matthew Woodward <m...@mattwoodward.com>
Subject Map/Reduce Question
Date Thu, 02 Dec 2010 00:43:46 GMT
I'm catching on to the map bit of map/reduce decently, but now that I need
to reduce something I'm having some issues, so I'm hoping someone can steer
me in the right direction.

I have a view that outputs a key, and then an array as the value using the
following map function:
{
    "viewname": {
        "map":"function(doc) {
            var value;
            if (doc.foo != '' && (doc.bar != '' || doc.baz != '')) {
                value = [doc.bar, doc.baz];
                emit(doc.foo, value);
            }
        }"
    }
}

To explain that a bit: in my documents foo (the key for this map function)
might be a zero-length string, in which case I don't want to output that
document's information. Likewise, if bar and baz are both zero-length
strings I don't want the document included, but if either bar or baz has a
value then I do want it. The value is then an array of bar and baz. This is
all working great.
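
To make the filter concrete, here is a minimal sketch of the same condition
run outside CouchDB in plain Node. The emit() stand-in, the emitted array,
and the sample documents are assumptions for illustration only; they are not
part of CouchDB or of my real data.

```javascript
// Hypothetical stand-in for CouchDB's emit(): it only collects rows
// so the filter can be checked without a database.
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Same condition as the map function in the design document above.
function map(doc) {
  if (doc.foo != '' && (doc.bar != '' || doc.baz != '')) {
    emit(doc.foo, [doc.bar, doc.baz]);
  }
}

// Made-up sample documents covering each branch of the condition.
const sampleDocs = [
  { foo: 'key1', bar: 'value1', baz: 'value2' }, // emitted
  { foo: '',     bar: 'value1', baz: 'value2' }, // skipped: foo is empty
  { foo: 'key2', bar: '',       baz: ''       }, // skipped: bar and baz both empty
  { foo: 'key2', bar: '',       baz: 'value2' }  // emitted: baz has a value
];

sampleDocs.forEach(map);
console.log(emitted.length); // -> 2
```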

The issue is that I have numerous duplicates in my output, i.e. where foo,
bar, and baz have the same values as another document. This is to be
expected in the documents themselves so there's no issue with the data.

For the purposes of this view, however, I only want to output unique results
for each value of foo (my key).

To use a concrete example, let's say currently using the map function above
I'm getting this output:
{"total_rows":3800,"offset":0,"rows":[
{"id":"guid1","key":"key1","value":["value1", "value2"]},
{"id":"guid2","key":"key1","value":["value1", "value2"]},
{"id":"guid3","key":"key2","value":["value1", "value2"]},
{"id":"guid4","key":"key2","value":["value1", "value2"]},
... etc. ...
]}

What I need to wind up with is this:
{"total_rows":3800,"offset":0,"rows":[
{"id":"guid1","key":"key1","value":["value1", "value2"]},
{"id":"guid3","key":"key2","value":["value1", "value2"]},
... etc. ...
]}

In other words, if the key and value are identical across records I only
want to output one result, but if the value is the same as another document
*but the key is different*, then I do want to include it in the output. Hope
I'm explaining that clearly.
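
The rule described above can be sketched in plain JavaScript, outside
CouchDB, as "keep the first row for each distinct (key, value) pair". The
function name uniqueByKeyAndValue and the viewRows array are hypothetical,
and the rows simply mirror the placeholder example output; this only
illustrates the result being asked for, not how CouchDB would produce it.

```javascript
// Rows mirroring the placeholder example output (guids and values are not real data).
const viewRows = [
  { id: 'guid1', key: 'key1', value: ['value1', 'value2'] },
  { id: 'guid2', key: 'key1', value: ['value1', 'value2'] },
  { id: 'guid3', key: 'key2', value: ['value1', 'value2'] },
  { id: 'guid4', key: 'key2', value: ['value1', 'value2'] }
];

// Keep only the first row seen for each distinct (key, value) combination.
function uniqueByKeyAndValue(rows) {
  const seen = new Set();
  return rows.filter(function (row) {
    // Serialize key and value together so equal arrays compare as equal.
    const signature = JSON.stringify([row.key, row.value]);
    if (seen.has(signature)) {
      return false; // same key and same value already output
    }
    seen.add(signature);
    return true;
  });
}

const uniqueRows = uniqueByKeyAndValue(viewRows);
console.log(uniqueRows.map(function (row) { return row.id; }));
// -> [ 'guid1', 'guid3' ]
```

Inside CouchDB, one direction that may give the same effect is to emit
[doc.foo, doc.bar, doc.baz] as the key, add a reduce such as the built-in
_count, and query the view with group=true, so that rows with identical keys
collapse into one. The reduced rows would then carry no "id" field, so the
output would not match the example above exactly.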

Happy to clarify further, and really appreciate any suggestions anyone has.

Thanks!
-- 
Matthew Woodward
matt@mattwoodward.com
http://blog.mattwoodward.com
identi.ca / Twitter: @mpwoodward

Please do not send me proprietary file formats such as Word, PowerPoint,
etc. as attachments.
http://www.gnu.org/philosophy/no-word-attachments.html
