Subject: Re: reduce/rereduce confusion
From: Adam Wolff
To: user@couchdb.apache.org, tech@dundeemt.com
Cc: couchdb-user@incubator.apache.org
Date: Tue, 20 Jan 2009 20:05:18 -0800

After looking at this more, let me restate. I would totally get all of this
if the signature of reduce was:

reduce: function(key, values, rereduce)

What I don't get is: why does reduce get called with an arbitrarily long list
of keys? I thought reduce was precisely for reducing all of the mapped inputs
that are indexed under the *same* key. I think if I can get that, the rest
will come clear.

Thanks again,
A

On Tue, Jan 20, 2009 at 7:52 PM, Adam Wolff wrote:

> Thanks for the reply!
> I'd seen all of this, though I re-read the Wikipedia entry carefully.
> Damien's blog entries don't appear to match the APIs in the version I'm
> running, which is 0.8.1.
> The Wikipedia entry suggests that reduce is called only with values that
> match a single key. Using the log() function in CouchDB, I can see that's
> not the case for its reduce function -- it's called with multiple different
> keys, though it does appear that the input values are *ordered* by matching
> keys.
>
> Anyway, I totally get how re-reduce (or "combine") works in conventional
> map/reduce, but I'm hazy on the details w/r/t CouchDB. I'm starting to
> understand the answer to #1, but I'm really unclear on #2 (how/why rereduce
> is run).
>
> Thanks again,
> A
>
>
> On Tue, Jan 20, 2009 at 6:50 PM, Jeff Hinrichs - DM&T wrote:
>
>> On Tue, Jan 20, 2009 at 7:47 PM, Adam Wolff wrote:
>> > Hi everyone, I'm really excited about CouchDB and I've started playing
>> > with it. I get all of it, except for reduce, and especially re-reduce.
>> >
>> > My first question is: how does CouchDB maintain all the separate output
>> > for a given key from the map function? I mean: given a simple reduce
>> > that just sums results, how does couch maintain separate results for
>> > each possible key/key range that can be given as input to that view?
>> >
>> > My second question: when and why does rereduce get called? Is this
>> > simply to allow the server to chunk the processing, or is there semantic
>> > meaning to it? I had assumed the former -- it's just a way of limiting
>> > the size of the input to the reduce function -- but then this really
>> > confused me: if I log each time my reduce function gets called, I see
>> > that the last time it's called, it's with rereduce=false. How is this
>> > possible? Don't all the results have to be funneled through rereduce to
>> > produce a single result value?
>> >
>> > Any help here would be much appreciated. If there's a resource on the
>> > web I should look at, please send it my way. Thanks!
>> >
>> > A
>>
>> Being that I just went through the learning process on reduce, I'll
>> point you here:
>> http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
>> "Reduce Functions"
>>
>> As a good place to start.
>> Also, the mailing list is an excellent resource.
>>
>> http://mail-archives.apache.org/mod_mbox/couchdb-user/200901.mbox/%3c61B374C7-34D7-45C3-9F8B-F11EFD77303D@apache.org%3e
>>
>> along with:
>> http://en.wikipedia.org/wiki/MapReduce
>> http://labs.google.com/papers/mapreduce.html
>> and
>> http://damienkatz.net/2008/02/incremental_map.html
>>
>> Regards,
>>
>> Jeff
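
For reference, the reduce behaviour asked about in this thread can be sketched as follows. This is a minimal sketch, not CouchDB's implementation: it assumes the (keys, values, rereduce) signature, with keys passed as [key, docid] pairs on the first pass and as null on a rereduce pass. The rows array, the reduceChunk helper, and the two-way split are hypothetical and chosen by hand for illustration; CouchDB decides its own chunking.

```javascript
// A sum reduce written against the assumed (keys, values, rereduce) signature.
function sumReduce(keys, values, rereduce) {
  // First pass (rereduce === false): values are the values emitted by map,
  // and keys holds one [key, docid] pair per value.
  // Rereduce pass (rereduce === true): keys is null and values holds results
  // previously returned by this same function.
  // For a plain sum, both passes can be handled identically.
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Hypothetical map output, already sorted by key.
var rows = [
  { key: "a", id: "doc1", value: 1 },
  { key: "a", id: "doc2", value: 2 },
  { key: "b", id: "doc3", value: 3 },
  { key: "c", id: "doc4", value: 4 }
];

// Hypothetical helper: run the first-pass reduce over a contiguous run of rows.
function reduceChunk(chunk) {
  var keys = chunk.map(function (row) { return [row.key, row.id]; });
  var values = chunk.map(function (row) { return row.value; });
  return sumReduce(keys, values, false);
}

// Reduce two hand-picked chunks, then combine the partial results.
var partialLeft = reduceChunk(rows.slice(0, 2));   // rows keyed "a", "a"
var partialRight = reduceChunk(rows.slice(2));     // rows keyed "b", "c"
var combined = sumReduce(null, [partialLeft, partialRight], true);

// Reducing every row in a single first-pass call gives the same total.
var direct = reduceChunk(rows);

console.log(combined, direct);
```

In this sketch a chunk is just a contiguous run of key-sorted rows, so a single call can see several distinct keys, which is consistent with the logging described in the thread. A sum behaves the same on both passes; a reduce that counts rows would instead have to add up the partial counts when rereduce is true.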