From: James Marca
To: user@couchdb.apache.org
Date: Sun, 28 Feb 2010 23:01:51 -0800
Subject: Re: Replication question
Message-ID: <20100301070151.GA20602@translab.its.uci.edu>
In-Reply-To: <92356a131002281329w73ab7d33k31cb94f82fc4d4a9@mail.gmail.com>

On Mon, Mar 01, 2010 at 10:29:03AM +1300, Blair Nilsson wrote:
> It shouldn't be surprising though, the target database may already
> have records in it that would change the results, which would be
> difficult to detect without running the map on all the data that was
> already there. Also it is quite likely that it would take longer to
> replicate all the view data than regenerate it. Hell, you may never
> use that view on the replicated end so transferring the processed data
> is a waste anyway.
>

Okay, but I still think it is a bug. Aside from specific document
conflicts, the rule for views is that identical input produces
identical output. So the documents that replicate successfully from
one db to the other should produce identical output from identical
view code.

I don't know much about b-trees, but I suspect there are algorithms to
merge two b-trees efficiently. If that is true, and the view is
already computed, then isn't the laziest response just to copy it over
and merge it with the current view, even if the replication conflicts
have to be handled as a special case? CouchDB seems intelligent enough
in view generation to notice when docs have changed and recompute the
view only for those docs, so why can't similar code be applied here?

As to whether copying the views is useful, I think that is
application-specific. I've got a couple of terabytes of data waiting
in the pipe to be processed this way, so in my use case re-running the
view is out of the question, and re-using views is the height of
efficiency. And finally, I've only got two views (two design
documents), and I'm certainly going to be using them!
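
To make the idea concrete, here is a minimal sketch in Python of
reusing precomputed view rows on the target. It is not how CouchDB
works internally. The names map_fn, build_view, and merge_view are
hypothetical, the view is a plain dict rather than a b-tree, and a
matching _rev is assumed to mean identical document content.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Doc = Dict[str, object]
Rows = List[Tuple[object, object]]          # emitted (key, value) pairs
View = Dict[str, Tuple[object, Rows]]       # doc_id -> (rev, rows)
MapFn = Callable[[Doc], Iterable[Tuple[object, object]]]


def map_fn(doc: Doc) -> Iterable[Tuple[object, object]]:
    """Hypothetical deterministic map function: emit (type, 1) per doc."""
    if "type" in doc:
        yield (doc["type"], 1)


def build_view(docs: Dict[str, Doc], fn: MapFn) -> View:
    """Compute the view from scratch by mapping every document."""
    return {doc_id: (doc["_rev"], list(fn(doc))) for doc_id, doc in docs.items()}


def merge_view(target_docs: Dict[str, Doc], target_view: View,
               source_view: View, fn: MapFn) -> Tuple[View, List[str]]:
    """Reuse rows from source_view where the revision matches.

    Returns the merged view and the ids that had to be re-mapped
    (assumed here to be local-only docs or docs whose revision differs).
    """
    merged = dict(target_view)
    recomputed: List[str] = []
    for doc_id, doc in target_docs.items():
        rev = doc["_rev"]
        if doc_id in merged and merged[doc_id][0] == rev:
            continue                          # already indexed at this revision
        entry = source_view.get(doc_id)
        if entry is not None and entry[0] == rev:
            merged[doc_id] = entry            # same input, so same output: copy
        else:
            merged[doc_id] = (rev, list(fn(doc)))
            recomputed.append(doc_id)
    return merged, recomputed
```

Under these assumptions only the documents that differ between the two
databases go through the map function again; everything else is copied.
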

Regards,
James Marca