From: "Geir Magnusson Jr."
To: dev@couchdb.apache.org
Subject: Re: couchdb transactions changes
Date: Sat, 7 Feb 2009 17:08:14 -0500

On Feb 7, 2009, at 11:22 AM, Damien Katz wrote:

>
> On Feb 7, 2009, at 11:02 AM, Geir Magnusson Jr. wrote:
>
>> Thanks for the info. Is there a third mode possible? Namely all
>> or nothing with conflict check, with the understanding that the
>> conflict guarantee is only at commit, and all bets are off after
>> that when replicated?
>>
>
> That's what we currently have. It's possible to keep supporting it,
> but it doesn't work with any of CouchDB's distributed features. It's
> only appropriate for a single node instance, even a hot standby
> slave will have inconsistent states.

Sure...

Assuming we're defining things the same way, I think that the existing mode still might be useful - I could consider a node to be the "reference master" for my data (or a subset) and vector all writes there with whatever consistency promises I get from a single node. Then everyone else will be eventually consistent, and I'd know that the eventually consistent nodes have a transactionally consistent data set?

I realize I may not attach the same meaning to concepts, but can you get a sense of what I'm saying?

geir

>
> -Damien
>
>>
>>
>> On Feb 7, 2009, at 10:47 AM, Damien Katz wrote:
>>
>>> I'm working on a branch that implements the couchdb security
>>> features with replication. It's not done yet, but anyone is welcome
>>> to look at the branch in /branches/rep_security.
>>>
>>> In this patch I am attempting to implement new transaction
>>> models. The old transaction model gives you all or nothing commits
>>> for a group of docs, along with conflict checking. If any document
>>> was in conflict, the transaction as a whole doesn't save.
>>>
>>> The problems with this are:
>>> 1. Transactions don't work with replication. Replication doesn't
>>> repeat the bulk single transaction, it just copies the documents
>>> individually to the target replica. This means any downstream
>>> replica can and will see inconsistent states until replication
>>> fully completes, not "all or nothing" states. With bidirectional
>>> replication it is even worse, as you can get edit conflicts that
>>> must be resolved by an external process.
>>> 2. Transactions don't work in a partitioned database without a
>>> huge performance hit (locking + 2 phase commits).
>>>
>>> So I propose supporting 2 different transaction models:
>>>
>>> The first is to support "All or nothing commits", but without
>>> guaranteed conflict checking. So you can save a bunch of documents
>>> to the database and be sure they are all safely stored, or none
>>> are safely stored, but you can't be guaranteed you don't have any
>>> conflicts when you do.
>>>
>>> The second is to support non-acid bulk transactions, where some
>>> documents fail and some succeed. If the db crashes in the middle of
>>> the transaction, some documents may have made it to disk
>>> (completely intact), while others have not. The client will need
>>> to check to be sure.
>>>
>>> With these 2 transaction models, it's possible to deploy the same
>>> apps on a single machine or a huge partitioned cluster. With the
>>> current model, it's only possible to deploy apps on a single
>>> machine. I propose we drop the current model as bulk
>>> transactions are not supportable in clustered or replicated setups.
>>>
>>> -Damien
>>>
>
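
To make the three models from the quoted proposal concrete, here is a rough Python sketch of how a bulk save could behave under each. ToyStore, its method names, and the integer revision scheme are all made up for illustration; this is not CouchDB's API or storage code. Crash safety is not modelled at all, only which documents end up stored.

```python
class ToyStore:
    """Hypothetical in-memory stand-in for a single database node."""

    def __init__(self):
        self.winners = {}    # doc id -> current document (integer "_rev")
        self.conflicts = {}  # doc id -> extra revisions stored as conflicts

    def is_conflict(self, doc):
        # A write conflicts when it is not based on the revision we hold.
        current = self.winners.get(doc["_id"])
        held_rev = current["_rev"] if current else 0
        return doc.get("_rev", 0) != held_rev

    def _store(self, doc):
        new_doc = dict(doc, _rev=doc.get("_rev", 0) + 1)
        if self.is_conflict(doc):
            # Keep the write, but only as a conflicting revision.
            self.conflicts.setdefault(doc["_id"], []).append(new_doc)
        else:
            self.winners[doc["_id"]] = new_doc

    def save_checked_atomic(self, docs):
        """Current model: all or nothing, with conflict checking."""
        if any(self.is_conflict(d) for d in docs):
            return False  # one conflict rejects the whole group
        for d in docs:
            self._store(d)
        return True

    def save_all_or_nothing(self, docs):
        """Proposed model 1: every doc is stored, conflicts included."""
        for d in docs:
            self._store(d)
        return True

    def save_non_atomic(self, docs):
        """Proposed model 2: each doc succeeds or fails on its own."""
        results = {}
        for d in docs:
            ok = not self.is_conflict(d)
            if ok:
                self._store(d)
            results[d["_id"]] = ok  # the client has to check these
        return results
```

With a batch where one document is based on a stale revision, the first method saves nothing, the second saves everything but records a conflict, and the third saves only the documents that were up to date and reports the rest as failed.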