Date: Fri, 6 Feb 2009 00:13:21 -0500
Subject: Re: Transactional _bulk_docs
From: Paul Davis
To: dev@couchdb.apache.org

On Thu, Feb 5, 2009 at 10:02 PM, Antony Blakey wrote:
>
> On 06/02/2009, at 6:20 AM, Chris Anderson wrote:
>
>> Antony, maybe it would help for you to explain just exactly what you
>> wouldn't be able to do, without the bulk docs API. It will help to
>> inform people about the technical issue.
>
>
> My original email included this:
>
> -------------------------------------------------------
>
> For example, I have documents that can be cloned. The cloned document
> contains a reference to the originating document. Then I delete the original
> document, the clone history needs to be updated to remove the reference to
> the original document and replace it with an original-deleted history item.
> There is a business case that requires this consistency.
>
> With a transactional API this is easy. Without it, I can't see a way to
> maintain consistency in the face of concurrent application access and/or
> failure.
>
> -------------------------------------------------------
>
> However, I don't think this is really about a specific example.
>
> The problem is that if you get one side of the relationship written and
> visible, but the other side not, then other concurrent accessors will see a
> partially successful update.
>
> One response is "but you'll see this problem during replication", but I
> think this is making a big assumption about how replication is
> managed/interleaved with local application behaviour.
>
> Replication, and dealing with conflicts, is in no way automatic. As others
> have stated, there is no domain-independent way of resolving conflicts.
> Surely if it were possible to build a transactional API on top of a
> conflict-based system, then this statement would not be true?
>
> I am deploying CouchDB like a Notes CLIENT. Not as a high-performance
> database server. Replication is an explicit operation, that halts normal
> activity. For my first delivery, replicas are read-only, so replication
> conflict isn't possible, but when I move to a distributed writers scenario,
> resolving replication conflicts will involve a specialized UI, that allows
> all conflicts to be resolved before normal operation resumes. Thus the
> editing application always sees a conflict-free database.
>
> The use-case of someone doing a local operation e.g. submitting a web form,
> is very different than resolving replication conflicts. Conflict during a
> local operation is a matter of application concurrency, whereas conflict
> during replication is driven by the overall system model. It has different
> temporal, administrative and UI boundaries.
>
> In short, I think it is a mistake to try and hide the different
> characteristics of local (even clustered) operations, and replication. You
> may disagree, but if the system distinguishes between these two
> fundamentally different things (distinguished by their partition-tolerance),
> you can code as though every operation leads to conflict if you wish, but I
> can't take advantage of the difference.
>
>> I know that the long-standing vision of Couch doesn't include special
>> API exceptions for when you are running on a single node. And I'm a
>> little afraid that the transactional doc commits Antony wants us to
>> keep, are only a mirage, which would lead to trouble anyway, when
>> distributed systems are involved.
>
> I don't understand why this needs to be the case. You can do transactions in
> distributed systems. Do you have a model that isn't amenable to a Scalaris
> treatment? Especially given that we're only talking about transactions over
> a set of processes that are providing an illusion of a single system. Such a
> cluster already requires some degree of partition-tolerance, right? And if
> not, then what distinguishes a cluster from a partition-tolerant p2p mesh?
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> The fact that an opinion has been widely held is no evidence whatever that
> it is not utterly absurd.
> -- Bertrand Russell
>
>
>

I'm upset that CouchDB doesn't make me coffee in the morning. But the
thing is, CouchDB is totally willing to make you coffee *and* bacon.
It loves you *that* much.

Enough with the silly.

I've watched this drama avalanche for a while and I finally think it's
time for me to put out a few words on what I've seen.

A brief history:

1. The mythical IRC conversation on 'removing' the feature: (roughly quoted)

Damien: I don't think we can support transactional commits in the face
of multiple nodes. We can do ACID writes to disk so that updates
aren't lost, but checking with an unbounded number of servers that a
commit doesn't conflict isn't feasible.
Everyone else: That's pretty reasonable.

2. A patch was applied to trunk that made commits to CouchDB
optionally ACID compliant (which gives users the traditional
speed/safety choice) as well as removing the atomic 'all or none'
semantics.

3. Huge ML threads.

History complete.

Current status (through my eyes):

Near as I can tell, Damien has been nose to the grindstone for quite
some time on this very specific part of the API. Would I like more
status updates and ideas on where he's heading? Of course. Do I trust
him? Yes. Is the community as a whole going to blindly accept some
asinine patch that has no value that removes a crap load of
functionality? No.

Controversy!

I tend to think that the 'discussion' that everyone keeps referring to
hasn't even occurred yet. I look at the patch that was applied that
caused this as an unfortunate early release.

What?!

Admissions first: I have no money riding on this issue. Whether or not
CouchDB has transactional _bulk_docs worries me not at all. Though, I
can't say that I have that much sympathy for a business model that
relies on an open source project's trunk to remain compatible with
required assumptions.

Break: People seem to think that this conversation is over and done
with. It isn't. This is a part of the API that's under work and will
change.

Reductio ad absurdum: Do we require a mailing list thread for every
character changed in the source?

HTH,
Paul Davis