From: Jan Lehnardt
To: dev@couchdb.apache.org
Subject: Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)
Date: Wed, 3 Feb 2010 19:21:53 -0800
Message-Id: <2C591A9F-55E4-49DD-A3E3-9BA075EAE633@apache.org>
In-Reply-To: <015a01caa529$3d24e230$b76ea690$@com>
References: <20100203212426.GA10515@uk.tiscali.com> <015a01caa529$3d24e230$b76ea690$@com>

Hi James,

thanks for your thoughts. I do agree with most points. But I'd like to
propose a pragmatic way out.

I think Chris' auth design is pretty solid. We have been thinking about
this space for over two years now and this is the first thing that makes
me happy.

Chris' auth design is also not "complete" as in you can have a public
CouchDB running that serves all your security needs. I don't think we
should try and pretend everything is as locked down as is needed in
every scenario, but get a limited set of scenarios right and expand from
there.

I think the current way is a good way forward. We can address more
scenarios further down the line. If that means some, even many users
have to keep combining CouchDB with an HTTP proxy as they have done in
the past, I'm okay with that.

If I thought Chris' design would be fundamentally flawed, I'd have a
different position, though.

We won't ever ship 1.0 if we want CouchDB to be "perfect".

Cheers
Jan
--

On 3 Feb 2010, at 15:33, James Hayton wrote:

> Hi Everyone-
>
> I am just an end user of couch right now, but the development of these
> security features is important to me so I thought I would share my
> thoughts.
>
> In general and specifically regarding points 5 and 9, I have to agree very
> passionately with Brian. There is no way that I want my users having access
> to the email addresses, usernames, password hashes, etc... of any other
> users on my system. That would be a very bad thing and potentially even
> derail some sort of use cases I had thought of for couch in the future.
> I would also prefer couch to be closed by default and open as needed, but
> this doesn't matter too much as long as I can close it down.
>
> I also like the idea of restricting _all_dbs to return only the databases
> that the user has access to. I would accept it as an admin only resource if
> that's what everyone thought was best or the only option, but it seems crazy
> to allow users to see the names of databases that they don't have access to.
> There is no point imho. They don't have access for a reason.
>
> Lastly I would have to say that I would much rather have 0.11 and the
> feature freeze for 1.0 held off for weeks or even months if required to get
> this security stuff worked out properly. It's incredibly important before
> very many real world applications with sensitive data can actually be built.
> I really believe couch is an amazing piece of software that could create so
> many new opportunities by utilizing its unique features such as replication
> and hosted applications. But, the true potential of couch really depends on
> rock solid authorization capabilities. Without that, I don't think couch is
> truly ready for a 1.0 release. Proper authorization probably isn't an easy
> thing to solve and I understand the desire to punt on this stuff and just
> get something working, but I just don't think it's a good long term
> strategy.
>
> I hope I have not come off as unappreciative or demanding. I truly
> appreciate all the work that you guys are doing regarding security and with
> couch in general. I also understand that my opinion probably carries very
> little relative weight here since I am not a contributor, but I would really
> plead with everyone not to rush a release or freeze the api without really
> making sure that these issues have been thought through carefully, tested,
> etc...
>
> Thanks,
>
> James Hayton
>
> -----Original Message-----
> From: Brian Candler [mailto:B.Candler@pobox.com]
> Sent: Wednesday, February 03, 2010 1:24 PM
> To: dev@couchdb.apache.org
> Subject: Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)
>
> On Wed, Feb 03, 2010 at 09:21:10AM -0800, Chris Anderson wrote:
>> Let me see if I can address some of these concerns.
>
> Thank you for taking the time to reply in detail and to implement some of
> the changes.
>
>>> I believe that in its current form, _all_dbs simply won't scale to millions
>>> of databases on a box if you want to limit it to accessible dbs only.
>>
>> This is an interesting one. _all_dbs won't scale indefinitely even
>> before this patch, because it has no built-in pagination abilities.
>> Enhancing this feature to look into each file and keep going till
>> it finds N that can be listed isn't hard to code. It will be a little
>> more work to make _all_dbs respect startkey and endkey.
>
> I agree that it's not hard to code. What I mean is that it won't scale if
> the server has to open and read a million files on disk to find the two that
> you have access to.
>
> Making _all_dbs an admin-only resource as Jan proposed is brutal but
> effective in protecting the server. (Admins probably do want _all_dbs
> paginated, but that's a separate issue). Futon would of course have to be
> changed so that non-admin users type in the name of the database they want
> to access.
>
> So far this hasn't answered the question: why not put the authorization in
> the _users document instead? But I think we're getting to that :-)
>
>>> (2) _readers is a single monolithic object. I believe that it won't scale to
>>> millions of users having access to the same database.
>>
>> It's not meant to support this use case.
>> If you have millions of users
>> with the same access rights, give them a common role and give that
>> role access to the database.
>
> That doesn't scale either, because what couchdb calls "roles" are really
> what I'd call "groups". That is, they are a system-wide collection of users.
> They are only maintainable by a system-wide administrator.
>
> What I'm thinking of is that database1 contains application1, with a
> collection of users. database2 contains application2, with another
> collection of users, and so on. These databases/applications are hosted on
> the same server, but belong to third parties.
>
> IMO it's an unrealistic expectation for the database1 owner to come along to
> the system administrator and say:
>
> 1. I'm having problems with scaling my _readers.
> 2. Please create a new role I can use, and apply this to all my existing
>    readers.
> 3. More importantly, every time I need to add a new user to my application,
>    I will come back to you and ask you to add this role to that user.
>
> Then for the database2 owner to come along and ask for the same.
>
> That is, because roles are *system-wide* not *database-wide* then the
> management of them doesn't scale if you want to use them for database-level
> access controls.
>
> Given the above: as system administrator you could decide to create roles
> like "database1:_reader" to simplify administration and avoid role name
> clashes. You could even arrange the validate_doc_update in the _users
> database so that a delegated person in database1 is able to add and remove
> database1:* roles without having to trouble the systemwide admin.
>
> But that's exactly what my proposal was. In which case, why can we not just
> use this mechanism in the first place?
>
>>> (3) _readers has no concurrency control. One admin making an ACL change in
>>> futon (say) will silently overwrite changes made around the same time by
>>> another admin.
>>> This will get worse the more frequently users are added and
>>> removed.
>>
>> _readers / _admins / _security are stored as a raw object without
>> concurrency control, because keeping them as a document adds too much
>> performance overhead on each request. Concurrency control is a
>> tradeoff we make here.
>
> Sorry to be blunt, but do you have numbers to back that up? This smells
> very much of premature optimisation.
>
> In any case: if db:_reader and db:_admin are just roles, you have them in
> the userctx object already. That's clearly *more* efficient than having
> them separately in the database.
>
> _security is an edge case. I consider it as an adjunct to the design doc.
> You could, after all, hardcode
>
>   var security = { .... };
>
> in the top of your validate_doc_update; it just avoids you having to touch
> your design doc so often. Since there's only a single _security document
> it's going to end up cached anyway.
>
>> The database-specific roles and names don't belong in the users db. The
>> users db is for answering the question: "who is the user and what
>> roles do they have". The ACLs say which names and roles can read or
>> admin a given database.
>>
>> It's a fact of life that users can rsync db-files around. If the names
>> / roles are in the users db, they get wrong when databases are moved
>> to another host or renamed on the current host.
>
> The last sentence I agree with. The same is true if you delete a DB and
> recreate one with the same name.
>
> However, database uuids were proposed recently. If the _users doc authorized
> against uuids rather than database names, would that issue be solved?
> (The ability to have a per-user _all_dbs view would be lost, if there wasn't
> a fast way to map a uuid back to a database name, but we've already decided
> we can live without that)
>
>> 4 is fixed.
>
> Thanks.
> It didn't even add any data privacy, since an _admin could always
> add themselves as a _reader anyway.
>
>>> (5) Non-admin readers can view the entire _readers, _admins and _security
>>> resources. I think this is quite a severe privacy concern, but it is easily
>>> fixed.
>>
>> They can also read the design document. I'm not sure why this is a
>> privacy concern. A user may need to contact a db admin for help with
>> something, it's handy to be able to get a list of them. And it only
>> makes sense that you should be able to see the list of users who can
>> also access the same db you can.
>>
>> If there's consensus that this is indeed an issue, it's not a hard
>> thing to change in the code.
>
> I await what others say. However I would certainly *not* want the internal
> E-mail addresses of my admins being available to the whole world. And as an
> end-user of a facebook-style application, I would not want my E-mail address
> known to every other user anywhere on that database.
>
> For comparison: if you're granted SELECT, INSERT, UPDATE and DELETE
> privileges as an Oracle user, that does not mean you get to find out the
> usernames of the admins, or indeed any other users with rights to the same
> database.
>
> And for comparison: if someone signs up for the dev@apache.org mailing list,
> they do not get to see the E-mail addresses of all the other members of this
> list, and nor should I.
>
> (I think the latter comparison is fair; a couchapp BBS would be a very sweet
> thing to have)
>
>>> (9) The _users db itself is world-readable (showing not only who your users
>>> are, but their password hashes). Highly undesirable.
>>
>> I actually consider this a feature. We'd like to get some stronger
>> password hashing (see the bcrypt threads) which should help with the
>> password parts.
>
> At the end of the day, bcrypt is still a hash of a password.
> Any password hash is open to off-line brute-force attack. You can tune the
> cost with bcrypt, but dictionary attacks are still going to succeed for 90%
> of users. You may be running couchdb on a modest server but your attacker is
> thousands of times more powerful than you, and can spend years doing it if
> they want.
>
> Put it another way: if I suggested that people should start making
> /etc/shadow world-readable, people would laugh. If I suggested that they
> also post it on their public webserver, I would be laughed out of town.
>
> Blocking _users is probably good enough for now. I'd be more comfortable if
> _readers didn't fail-open. I'm also concerned that newcomers may not be
> impressed to find couchdb so "insecure" in its default state.
>
> However, if _users could be blocked, but there were a restricted API for
> manipulating it (something like _update and _show handlers, allowing users
> only to see and change their own records), that would be much better IMO.
>
> Regards,
>
> Brian.
>
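
The delegated role management Brian describes in the quoted message (a validate_doc_update in the _users database that lets a database owner add and remove "database1:*" roles without involving the server admin) could look roughly like the sketch below. It is an illustration only, not code from CouchDB or from anyone in this thread. The function name validateUserRoles, the helper names, and the convention that a delegate for database X holds the role "X:_admin" are all assumptions made here. It also assumes user documents keep their roles in a `roles` array and that the server admin appears in userCtx.roles as "_admin".

```javascript
// Hypothetical sketch of a validate_doc_update function for the _users db.
// Assumption: a delegate for database "X" holds the role "X:_admin" and may
// then grant or revoke any role of the form "X:<something>" on any user doc.

function contains(list, item) {
  return list.indexOf(item) !== -1;
}

// Roles present in exactly one of the two lists, i.e. added or removed.
function changedRoles(before, after) {
  var added = after.filter(function (r) { return !contains(before, r); });
  var removed = before.filter(function (r) { return !contains(after, r); });
  return added.concat(removed);
}

function validateUserRoles(newDoc, oldDoc, userCtx) {
  // Server-wide admins may change any role.
  if (contains(userCtx.roles, "_admin")) {
    return;
  }
  var before = (oldDoc && oldDoc.roles) || [];
  var after = newDoc.roles || [];

  changedRoles(before, after).forEach(function (role) {
    var sep = role.indexOf(":");
    // Roles without a "<db>:" prefix are system-wide: admin only.
    if (sep < 1) {
      throw { forbidden: "only a server admin may change role " + role };
    }
    var db = role.slice(0, sep);
    if (!contains(userCtx.roles, db + ":_admin")) {
      throw { forbidden: "not a delegate for " + db };
    }
  });
}
```

This sketch checks role changes only. It does not restrict edits to other fields of the user document, and it does nothing about who can read _users, so it would complement the blocking of _users discussed above rather than replace it.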