Subject: Re: The replicator needs a superuser mode
From: Adam Kocoloski
Date: Tue, 16 Aug 2011 22:49:16 -0400
Message-Id: <54390D26-2473-4B42-A2F2-EBBC7ED91D5A@apache.org>
To: dev@couchdb.apache.org

On Aug 16, 2011, at 10:31 PM, Jason Smith wrote:

> On Tue, Aug 16, 2011 at 9:26 PM, Adam Kocoloski wrote:
>> One of the principal uses of the replicator is to "make this database
>> look like that one". We're unable to do that in the general case today
>> because of the combination of validation functions and out-of-order
>> document transfers. It's entirely possible for a document to be saved
>> in the source DB prior to the installation of a ddoc containing a
>> validation function that would have rejected the document, for the
>> replicator to install the ddoc in the target DB before replicating the
>> other document, and for the other document to then be rejected by the
>> target DB.
>
> Somebody asked about this on Stack Overflow. It was a very simple but
> challenging question, but now I can't find it. Basically, he made your
> point above.
>
> Aren't you identifying two problems, though?
>
> 1. Sometimes you need to ignore validation to just make a nice, clean copy.
> 2. Replication batches (an optimization) are disobeying the change
> sequence, which can screw up the replica.

As far as I know the only reason one needs to ignore validation to make
a nice clean copy is because the replicator does not guarantee the
updates are applied on the target in the order they were received on the
source. It's all one issue to me.

> I responded to #1 already.
>
> But my feeling about #2 is that the optimization goes too far.
> replication batches should always have boundaries immediately before
> and after design documents. In other words, batch all you want, but
> design documents [1] must always be in a batch size of 1. That will
> retain the semantics.
>
> [1] Actually, the only ddocs needing their own private batches are
> those with a validate_doc_update field.

My standard retort to transaction boundaries is that there is no global
ordering of events in a distributed system. A clustered CouchDB can try
to build a vector clock out of the change sequences of the individual
servers and stick to that merged sequence during replication, but even
then the ddoc entry in the feed could be "concurrent" with several other
updates. I rather like that the replicator aggressively mixes up the
ordering of updates because it prevents us from making choices in the
single-server case that aren't sensible in a cluster.

By the way, I don't consider this line of discussion presumptuous in the
least.

Cheers, Adam
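
A minimal sketch of the vector-clock point above, assuming a hypothetical two-node cluster in which each update is stamped with the per-node change sequences it has seen. The node names, the sequence numbers, and the `compare` helper are illustrative assumptions, not CouchDB APIs. It shows how a ddoc update can be neither before nor after another update, so no batch boundary can order the two.

```python
from typing import Dict

# A vector clock here is a map from node name to that node's change sequence.
VectorClock = Dict[str, int]


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for clocks a and b."""
    nodes = set(a) | set(b)
    # Missing entries count as sequence 0 on that node.
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


# Hypothetical clocks: the ddoc was written via node1, the document via node2,
# and neither write had seen the other when it was stamped.
ddoc_update = {"node1": 5, "node2": 2}
doc_update = {"node1": 4, "node2": 3}

relation = compare(ddoc_update, doc_update)
print(relation)
```

When the result is "concurrent", the merged sequence gives no basis for deciding whether the validation function should apply to the other document, which is the case a single-server ordering rule would not cover.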