From: "Nick Johnson"
To: couchdb-user@incubator.apache.org
Date: Mon, 20 Oct 2008 22:31:30 +0100
Subject: Re: uuid, auto-increment, et al.
Cc: "Antony Blakey"

On Mon, Oct 20, 2008 at 5:10 PM, ara.t.howard wrote:

> On Oct 19, 2008, at 11:34 PM, Antony Blakey wrote:
>
>> 3. Use couchdb in a single-writer multiple-reader scenario. If you only do
>> that for those activities that require uniqueness then you have consistency
>> issues to deal with because replication is asynchronous. One way to do that
>> is to switch a session to the writable server as soon as you need
>> uniqueness. The single writer becomes a bottleneck, but this is what I'm
>> doing because it matches my information architecture.
>>
>> 4. Use a central specialized server to check uniqueness and generate an
>> opaque userid token that you would subsequently use as a key (you shouldn't
>> use the username as a key). An ldap server or something like it. Equivalent
>> to the option above, but the single server only needs to deal with the
>> particular operations requiring uniqueness. It's still a single point of
>> failure, but I don't think you can get around that if you want synchronous
>> global uniqueness testing.
>
> you know, thinking about this further is causing me to wonder if even
> that'd do it. consider a single write master attempting to check if a
> given login is unique. with couch there's no way to ensure that two keys are
> unique other than using said key as the _id for the doc.
> it's possible that'd work in some cases, but consider
>
>   user
>     login    : string (unique)
>     password : string
>     email    : string (unique)
>
> and it suddenly becomes difficult again - we cannot ensure both are unique
> with an atomic test/set.

If you have 'user' docs and 'email' docs (the latter may contain nothing
but a reference to the user that owns the email), you can attempt to insert
both a new user doc and a new email doc in a single bulk write transaction.
The transaction will fail if either one already exists.

This doesn't really work with my proposed sharding method, however, since
you can't guarantee both entities will hash to the same shard. You could,
though, insert one doc and then the other, rolling back the first if the
second insert fails.

> we cannot simply combine them as a key because
>
>   login: foo
>   email: foo@bar.com
>
> and
>
>   login: bar
>   email: foo@bar.com
>
> would become a unique pair. rather we'd have to do something like keep a
> master index of all docs - or perhaps just a blank doc (index) mapping
> emails to users and bulk update the pair of them together.
>
> so there are probably solutions lurking, but all of them seem very
> complicated to allow something as simple as allowing a user to have a simple
> piece of memorable information to enter and have an application respond by
> mapping that directly to the user's data.
>
> obviously we can solve the issue with multiple writes and reads combined,
> but that's not particularly good if other applications might be using this
> data actively (which is of course the case with a single writer).
>
> to summarize the problem - what's the preferred method of ensuring that a
> field of a couchdb doc is unique across all docs if that key cannot be the
> _id itself? is the only method to write it, process a view to find all docs
> which might have a duplicate field, and then possibly to delete the
> original?
> it's all i imagine now but that's pretty tricky in a high
> concurrency situation because of the mutually destructive race condition
> that proceeds:
>
>   a - write doc with foo=bar
>   b - write doc with foo=bar
>
>   a - read list of all docs where foo==bar
>   b - read list of all docs where foo==bar
>
>   a - decide ours was a dup. destroy it. report error to user.
>   b - decide ours was a dup. destroy it. report error to user.
>
> which is full of messiness if the network or db goes down but, most
> importantly, can fail even when it's up.
>
> what am i missing when it comes to attempting to model a doc field which
> should contain a unique value?
>
> cheers.
>
> a @ http://codeforpeople.com/
> --
> we can deny everything, except that we have the possibility of being
> better. simply reflect on that.
> h.h. the 14th dalai lama
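
For what it's worth, the "insert one doc then the other, roll back the first
on failure" idea from the reply above could look roughly like the sketch
below. It runs against an in-memory stand-in, not CouchDB itself: FakeStore,
Conflict, register and the "user:" / "email:" id prefixes are all made up
for illustration. The only property assumed of the real store is that
creating a doc whose id already exists fails instead of overwriting.

```python
class Conflict(Exception):
    """Raised when a document with the given id already exists."""


class FakeStore:
    """Hypothetical in-memory stand-in for a store that rejects duplicate ids."""

    def __init__(self):
        self.docs = {}

    def create(self, doc_id, body):
        # Create-if-absent: fail rather than overwrite an existing doc.
        if doc_id in self.docs:
            raise Conflict(doc_id)
        self.docs[doc_id] = body

    def delete(self, doc_id):
        # Deleting a missing doc is treated as a no-op.
        self.docs.pop(doc_id, None)


def register(store, login, password, email):
    """Claim both the login and the email, or neither."""
    user_id = "user:" + login
    email_id = "email:" + email

    # First insert: fails with Conflict if the login is already taken.
    store.create(user_id, {"password": password, "email": email})
    try:
        # Second insert: the email doc only points back at its owner.
        store.create(email_id, {"user": user_id})
    except Conflict:
        # Email already claimed: undo the first insert so the login is free.
        store.delete(user_id)
        raise
    return user_id
```

Because each uniqueness check is a create on a doc id, two concurrent
registrations cannot both succeed and cannot both destroy themselves, unlike
the write-then-read-the-view sequence quoted above. The roll-back is not
atomic, though: a crash between the two inserts leaves a user doc without its
email doc, so some cleanup pass would still be needed.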