Date: Wed, 3 Feb 2010 14:23:25 +0000
From: Brian Candler
To: dev@couchdb.apache.org
Subject: Re: 0.11 Release / Feature Freeze for 1.0
Message-ID: <20100203142325.GA9028@uk.tiscali.com>

I see the readeracl branch was recently merged into trunk, and I've just
been testing it again.

My concern is that the design is flawed, and that if this goes into 0.11
then we are stuck with it forever; so it's better to address the problems
sooner rather than later.

I do see logic in keeping the admin/reader authorizations for a database
within the database itself. The problems are:

(1) _all_dbs currently shows everything - even those databases you don't
have access to. This leads to the following issues:

* Users won't want other users to be able to see the names of their
  databases, for privacy reasons. (Imagine what would happen if github
  revealed the names of all the private repos on it, even if you couldn't
  access the contents)
* I don't want users to know how many databases I have on my box, for
  commercial reasons.
  (Ditto for github).
* I don't want users to have to page through loads of databases they
  can't access
* Futon constantly pops up errors about "Database information could not be
  retrieved" (although that one's easily fixable)

I believe that in its current form, _all_dbs simply won't scale to millions
of databases on a box if you want to limit it to accessible dbs only.

(2) _readers is a single monolithic object. I believe that it won't scale
to millions of users having access to the same database.

(3) _readers has no concurrency control. One admin making an ACL change in
futon (say) will silently overwrite changes made around the same time by
another admin. This will get worse the more frequently users are added and
removed.

For me, those are serious problems.

I sketched a design for an alternative approach, using the _users db to
hold the authorizations in terms of database-specific roles. Unfortunately
I didn't have time to contribute an implementation of this. If there's a
chance this alternative approach would be used then I will try to steal the
time from somewhere. The ideas behind it weren't explicitly rejected, but
neither were they acknowledged as a good approach.

If the current design stays, then I think there will be sticking-plaster
solutions forever; e.g. proxies to fake out _all_dbs and ACL changes,
mapping them to a 'real' database behind.

Right, I've said all that. Now I have a few further observations from the
current implementation.

(4) An "admin" is not a "reader", and this is clearly intentional from
comments in the code. However, someone who has an "admin" role without
"reader" role is unable to perform ACL changes, which for me defeats the
whole purpose of the "admin" role.

Example: user "brianadmin" is in "_admins" on database "briantest", but not
in "_readers":

$ curl http://admin:admin@127.0.0.1:5984/briantest/_admins
{"names":["brianadmin"],"roles":[]}
$ curl http://admin:admin@127.0.0.1:5984/briantest/_readers
{"names":["brian"],"roles":[]}

But when "brianadmin" tries to read or update an ACL, here's what happens:

$ curl http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_readers
{"error":"unauthorized","reason":"You are not authorized to access this db."}
$ curl -X PUT -d '{"names":["foo","brian"],"roles":[]}' http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_readers
{"error":"unauthorized","reason":"You are not authorized to access this db."}

Even if this were fixed so that a db admin had access to _readers and
_admins resources (and design docs), I think that in practice a database
administrator would be expected to have access to the database she is
administering. In that case, adding the same user to both "admin" and
"reader" roles simply involves duplicating data, as well as having to
remember to remove that user from two places when she leaves. That
introduces more scope for error.

So I'd propose the relaxed approach: a database "admin" should inherit
"reader" rights. Isn't that true for a server-level admin anyway?

(5) Non-admin readers can view the entire _readers, _admins and _security
resources. I think this is quite a severe privacy concern, but it is easily
fixed.

(6) Databases are created world-readable by default, which means a race to
get the _readers set before someone else starts inserting documents. I
think a PUT /dbname option to set a non-empty readers list would be a good
idea (and a corresponding checkbox in futon).

(7) Couchdb accepts nonsense _readers documents, e.g.

$ curl -X PUT -d '{"names":{"foo":"bar"},"roles":456}' http://admin:admin@127.0.0.1:5984/briantest/_readers
{"ok":true}

The effect is to reset the _readers document to its permit-all default,
thus opening up the database to the world.

$ curl http://127.0.0.1:5984/briantest/_readers
{"names":[],"roles":[]}

(8) Point (7) is arguably a simple bug which can be fixed, but I'd prefer
for couchdb to be fail-safe; that is, an empty ACL means nobody has access.

One way to achieve this would be for two new roles, "_anon" and "_user",
granted to all unauthenticated and authenticated users respectively. Then a
fully public database would have roles:["_anon","_user"], and this would be
added to a new database unless you ask otherwise (see point (6)).

(9) The _users db itself is world-readable (showing not only who your users
are, but their password hashes). Highly undesirable.

You can set a _readers ACL on it, but it has consequences:

* users can't sign up for new accounts
* users can't change their own passwords

This forces such things through a privileged external interface. Actually
I'm fine with that, because I want to validate signups anyway, but others
might not be.

(10) I don't think you can replicate _readers, _admins and _security,
unless (a) you are doing rsync filesystem-level replication, or (b) you
explicitly GET and PUT these resources from one DB to another. This is
arguably a feature not a bug.

(11) Trivial problem of the day: _security resources which are not objects
give an erlang-flavoured error.

$ curl -X PUT -d '["foo","bar"]' http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_security
{"error":"unknown_error","reason":"function_clause"}

Sorry for the long post, and if I really am barking up the wrong tree here,
please tell me so.

Regards,

Brian.
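
As a footnote to points (7), (8) and (11), here is a minimal Python sketch
of the fail-safe behaviour described there. It is not couchdb code and says
nothing about how the server is implemented: validate_acl and is_reader are
hypothetical names invented for this illustration, and the "_anon" and
"_user" roles are only the proposal from point (8). The idea is that a
malformed ACL is rejected outright rather than reset, and an empty ACL
grants access to nobody.

```python
# Roles proposed in point (8); they are an assumption, not existing behaviour.
ANON_ROLE = "_anon"  # implicitly held by unauthenticated requests
USER_ROLE = "_user"  # implicitly held by authenticated requests


def validate_acl(acl):
    """Return the ACL unchanged if it is well formed, else raise ValueError."""
    if not isinstance(acl, dict):
        # Covers point (11): a body such as ["foo","bar"] is not an object.
        raise ValueError("ACL must be a JSON object")
    for key in ("names", "roles"):
        value = acl.get(key)
        if not isinstance(value, list) or not all(
            isinstance(item, str) for item in value
        ):
            # Covers point (7): {"names":{"foo":"bar"},"roles":456} is refused.
            raise ValueError('"%s" must be a list of strings' % key)
    return acl


def is_reader(acl, name=None, roles=()):
    """Fail-safe check: access is granted only by an explicit match."""
    implicit = USER_ROLE if name is not None else ANON_ROLE
    effective_roles = set(roles) | {implicit}
    if name is not None and name in acl["names"]:
        return True
    return bool(effective_roles & set(acl["roles"]))
```

With this shape, a fully public database is one whose ACL lists both
implicit roles, e.g. {"names":[],"roles":["_anon","_user"]}, while
{"names":[],"roles":[]} locks everyone out instead of letting everyone in.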