From: Jason Smith
Date: Wed, 17 Aug 2011 10:15:32 +0700
Subject: Re: The replicator needs a superuser mode
To: dev@couchdb.apache.org

On Wed, Aug 17, 2011 at 9:49 AM, Adam Kocoloski wrote:
> On Aug 16, 2011, at 10:31 PM, Jason Smith wrote:
>
>> On Tue, Aug 16, 2011 at 9:26 PM, Adam Kocoloski wrote:
>>> One of the principal uses of the replicator is to "make this database look like that one". We're unable to do that in the general case today because of the combination of validation functions and out-of-order document transfers. It's entirely possible for a document to be saved in the source DB prior to the installation of a ddoc containing a validation function that would have rejected the document, for the replicator to install the ddoc in the target DB before replicating the other document, and for the other document to then be rejected by the target DB.
>>
>> Somebody asked about this on Stack Overflow. It was a very simple but challenging question, but now I can't find it. Basically, he made your point above.
>>
>> Aren't you identifying two problems, though?
>>
>> 1. Sometimes you need to ignore validation to just make a nice, clean copy.
>> 2. Replication batches (an optimization) are disobeying the change sequence, which can screw up the replica.
>
> As far as I know the only reason one needs to ignore validation to make a nice clean copy is because the replicator does not guarantee the updates are applied on the target in the order they were received on the source. It's all one issue to me.
>
>> I responded to #1 already.
>>
>> But my feeling about #2 is that the optimization goes too far. Replication batches should always have boundaries immediately before and after design documents. In other words, batch all you want, but design documents [1] must always be in a batch size of 1. That will retain the semantics.
>>
>> [1] Actually, the only ddocs needing their own private batches are those with a validate_doc_update field.
>
> My standard retort to transaction boundaries is that there is no global ordering of events in a distributed system. A clustered CouchDB can try to build a vector clock out of the change sequences of the individual servers and stick to that merged sequence during replication, but even then the ddoc entry in the feed could be "concurrent" with several other updates. I rather like that the replicator aggressively mixes up the ordering of updates because it prevents us from making choices in the single-server case that aren't sensible in a cluster.

That is interesting. So if it is crucial that an application enforce transaction semantics, then that application can go ahead and understand the distribution architecture, and it can confirm that a ddoc is committed and distributed among all nodes, and then it can make subsequent changes or replications.
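
Concretely, and only as a rough, untested sketch (the node URLs, database names, and the Python/requests calls are all invented for illustration), the application-side workflow I have in mind is something like:

    import requests

    SOURCE = "http://node0:5984/source_db"
    TARGETS = ["http://node1:5984/target_db", "http://node2:5984/target_db"]

    ddoc = {
        "_id": "_design/auth",
        # Per the footnote above, this is the only field that makes ordering matter.
        "validate_doc_update":
            "function (newDoc, oldDoc, userCtx) {"
            "  if (!newDoc.owner) {"
            "    throw({forbidden: 'every document needs an owner'});"
            "  }"
            "}",
    }

    # 1. Install the validation ddoc on every node first.
    for target in TARGETS:
        requests.put(target + "/" + ddoc["_id"], json=ddoc).raise_for_status()

    # 2. Confirm it is committed and visible on each node.
    for target in TARGETS:
        requests.post(target + "/_ensure_full_commit",
                      headers={"Content-Type": "application/json"}).raise_for_status()
        assert requests.get(target + "/" + ddoc["_id"]).status_code == 200

    # 3. Only now make further changes or kick off replication.
    for target in TARGETS:
        requests.post("http://node0:5984/_replicate",
                      json={"source": SOURCE, "target": target}).raise_for_status()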

Or, written as a dialogue:

Developer: My application knows or cares that Couch is distributed.
Developer: My application depends on a validation function applying universally.
Developer: But my application won't bother to confirm that it's been fully pushed before I make changes or replications.
Adam: WTF?

Snark aside, it's an excellent point. Thanks.

--
Iris Couch