Date: Wed, 18 Mar 2009 22:50:24 -0700
Message-ID: <56a83cd00903182250l5aaec9c2g8815899f32e6924a@mail.gmail.com>
Subject: Bulk updates and eventual consistency
From: David Van Couvering
To: dev@couchdb.apache.org

Hi, all.

I'm working on updating the Wiki to describe the new behavior of bulk updates. I read the very (very) long thread about Damien's change to the transactional semantics around _bulk_docs, and I understand the situation pretty well (I think). But there's one part of the discussion that I want to make sure I have correct.

My understanding is that one motivation for bulk updates is that you may have referential dependencies between docs. If there are no conflicts, then you can be assured those references will be consistent on the database where you do the bulk update (with all-or-nothing), *but* they may not immediately be consistent on replicas. This is because a bulk update is not replicated all-or-nothing; instead, each document is replicated independently, in an unspecified order. So you will have a temporary state of affairs where the references between documents may be inconsistent, but eventually they do become consistent (for that particular bulk update).

*But* if you *do* have conflicts in a bulk update, then it is quite possible that the choice of winners for the conflict will cause a referential inconsistency between documents.
In this case, the documents will *not* automatically become consistent over time; the application will have to intervene to resolve them to a consistent state. This can happen even when you are not using replication at all, if two simultaneous sessions update the same document.

In the previous implementation, a bulk update rolled back if there were any "local" conflicts, so you were guaranteed referential consistency between docs on the database instance where you applied the bulk update. However, you could still end up in a pickle if replication caused a conflict; then you are back in the same place, with a referential inconsistency that has to be manually resolved.

Do I have that right?

I am uncomfortable about asking the next question, as I feel I am opening up a can of worms, but I am missing what problem was solved by allowing all-or-nothing to succeed on conflicts. It seems like in both models you have eventual consistency and interim states where documents are inconsistent, but at least with the old approach you were guaranteed consistency on the database instance where you did the bulk update. That seems like it could be pretty handy, particularly for deployments where you are not doing replication.

My apologies if this was already answered in that very long thread, but perhaps someone can summarize for me...

Thanks,

David

--
David W. Van Couvering

I am looking for a senior position working on server-side Java systems. Feel free to contact me if you know of any opportunities.

http://www.linkedin.com/in/davidvc
http://davidvancouvering.blogspot.com
http://twitter.com/dcouvering