From: Wout Mertens
To: dev@couchdb.apache.org
Subject: Re: Proposal: Review DBs
Date: Tue, 28 Apr 2009 16:41:11 +0200

On Apr 28, 2009, at 3:46 PM, Adam Kocoloski wrote:

> I'm not an expert on the btree code, but I think this statement
>
>> So whenever a reduce call results in a new value for a b-tree node,
>> AND that node is the upper node of a subtree that is completely
>> part of a group key, that group key needs to be marked for
>> recalculation.
>
> misses some updates.  I thought things looked something like
>
>              root
>               |
>        ---------------
>       kp1           kp2
>        |             |
>    ---------     ---------
>   foo foo bar   bar bar bar
>
> where foo and bar are emitted keys from a map (values not shown for
> the sake of brevity).
> kp1 and kp2 hold reduce(foo,foo,bar) and
> reduce(bar,bar,bar) respectively, and root holds rereduce(kp1,kp2).

That is my understanding as well.

> When we request a view with group=true, Couch is smart enough to use
> the reduction stored in kp2 directly, but it has to call
> reduce(foo,foo) and reduce(bar) on the fly for the nodes underneath
> kp1 (and then rereduce the bar reductions from kp1 and kp2 to get
> the result for bar).  If I interpreted Wout's statement correctly,
> it ignores the case where any of the nodes under kp1 change, since
> kp1 is not "the upper node of a subtree that is completely part of a
> group key".

In this case, both foos are the upper node of a subtree that is
completely part of a group key. bar isn't. Same graph with the group
keys indicated:

             root
              |
       ------BAR------
      kp1           kp2
       |             |
   --FOO----     ---------
  foo foo bar   bar bar bar

So whenever the first bar changes, kp1 needs to be recalculated and
then BAR is marked as needing updating, since kp1 is a top node under
BAR. FOO is only marked for updating when either of the foos changes.

> I think the problem of tracking which group keys to update can be
> made simpler.  We really only need to see the key updates coming out
> of the incremental map.  Couch knows the lists of keys to add and
> keys to remove at that point; the set of all unique keys in those
> two lists (the "changeset") is the set of group keys that would need
> to be updated in the Review DB.  Here I'm using "update" in the
> general sense of add/remove/change; the way I see it, we could just
> query the view for all the keys in the changeset.  If the MR view
> has no results for a given key, that obviously means delete the
> associated document from the Review DB.

That is a very good idea! It's not as efficient as marking group keys
like I proposed (a changed map output does not necessarily change a
reduce value), but it's a lot easier to code.
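To make the changeset idea concrete, here is a minimal sketch in Python. It is not CouchDB code: the function names `truncate_key` and `changeset` are made up for illustration, and it assumes map keys are either scalars or flat lists of scalars (so they can be put in a set once converted to tuples).

```python
def truncate_key(key, group_level=None):
    """Reduce a map key to its group key (hypothetical helper).

    Array keys keep only their first `group_level` parts; with
    group_level=None the whole key is kept (exact grouping).
    Scalar keys are returned unchanged.
    """
    if not isinstance(key, (list, tuple)):
        return key
    parts = key if group_level is None else key[:group_level]
    return tuple(parts)  # tuples are hashable, lists are not


def changeset(removed_keys, added_keys, group_level=None):
    """Set of unique group keys touched by one incremental map update."""
    touched = set()
    for key in list(removed_keys) + list(added_keys):
        touched.add(truncate_key(key, group_level))
    return touched
```

For example, `changeset(["foo"], ["bar", "bar"])` yields the two group keys `foo` and `bar`, and each of them would then be queried in the view to decide what happens in the Review DB.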
> group_levels are a straightforward extension -- we just check what
> group_level we'll be using in the Review DB when we calculate the
> changeset.

Exactly. So the algorithm becomes:

- When updating a view, keep track of all mentioned keys in the
  previous and current map() output, keep only group_level key parts.
- After updating the reduce() results, for each of the marked group
  keys:
  - If a group key gets removed:
    - look up doc with key=group key in review db. If exists:
      - delete doc
  - If a group key gets added:
    - look up doc with key=group key in review db. If exists:
      - set doc.value to the row value
    - else
      - create doc with id=group key in string form, key=group key,
        value=value
  - If a group key gets updated:
    - look up doc with key=group key in review db. If exists:
      - set doc.value to the row value
    - else
      - create doc with id=group key in string form, key=group key,
        value=value

Good thinking!

Wout.
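The algorithm in the message can be sketched as follows. This is only an illustration under stated assumptions, not CouchDB code: the Review DB is modelled as a plain dict from doc id to doc, the grouped view result as a dict from group key to reduced value, and the name `sync_review_db` is hypothetical. Because the doc id is the group key in string form, the "look up doc with key=group key" step is done here by id. The "added" and "updated" cases in the list are identical, so they collapse into one upsert branch.

```python
import json


def sync_review_db(review_db, view_rows, touched_keys):
    """Bring the Review DB in line with the view for the touched group keys.

    review_db    -- dict: doc id -> doc (stand-in for the real database)
    view_rows    -- dict: group key -> reduced value after the view update
    touched_keys -- iterable of group keys marked during the map update
    """
    for key in touched_keys:
        doc_id = json.dumps(key)  # "group key in string form"
        if key not in view_rows:
            # Group key was removed: delete the doc if it exists.
            review_db.pop(doc_id, None)
        elif doc_id in review_db:
            # Group key was added or updated and the doc exists.
            review_db[doc_id]["value"] = view_rows[key]
        else:
            # Group key was added or updated and there is no doc yet.
            review_db[doc_id] = {
                "_id": doc_id,
                "key": key,
                "value": view_rows[key],
            }
    return review_db
```

Calling it with a Review DB that holds a doc for `foo`, a view that now only has a row for `bar`, and both keys marked, deletes the `foo` doc and creates the `bar` doc.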