incubator-couchdb-dev mailing list archives

From Brian Candler <B.Cand...@pobox.com>
Subject Re: 0.11 Release / Feature Freeze for 1.0
Date Wed, 03 Feb 2010 14:23:25 GMT
I see the readeracl branch was recently merged into trunk, and I've just
been testing it again.

My concern is that the design is flawed, and that if this goes into 0.11
then we are stuck with it forever; so it's better to address the flaws
sooner rather than later.

I do see logic in keeping the admin/reader authorizations for a database
within the database itself. The problems are:

(1) _all_dbs currently shows everything - even those databases you don't
have access to. This leads to the following issues:

* Users won't want other users to be able to see the names of their
  databases, for privacy reasons. (Imagine what would happen if github
  revealed the names of all the private repos on it, even if you couldn't
  access the contents)

* I don't want users to know how many databases I have on my box, for
  commercial reasons. (Ditto for github).

* I don't want users to have to page through loads of databases they can't
  access

* Futon constantly pops up errors about "Database information could not be
  retrieved" (although that one's easily fixable)

I believe that in its current form, _all_dbs simply won't scale to millions
of databases on a box if you want to limit it to accessible dbs only.
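
To illustrate the scaling concern, here is a minimal Python sketch (not
CouchDB code) of what filtering _all_dbs by access would involve when each
database holds its own ACL. The names can_read and accessible_dbs are
hypothetical, and an empty ACL is assumed to mean permit-all, as it does in
the current implementation. Every database's ACL has to be consulted for
every listing request.

```python
from typing import Dict, List

Acl = Dict[str, List[str]]  # shape of a _readers object: {"names": [...], "roles": [...]}


def can_read(acl: Acl, user: str, roles: List[str]) -> bool:
    """Assumption: an empty ACL is the permit-all default."""
    if not acl["names"] and not acl["roles"]:
        return True
    return user in acl["names"] or any(r in acl["roles"] for r in roles)


def accessible_dbs(all_acls: Dict[str, Acl], user: str, roles: List[str]) -> List[str]:
    """Filter the full database list; the cost grows with the total number
    of databases on the box, not with the number the user can access."""
    return sorted(db for db, acl in all_acls.items() if can_read(acl, user, roles))


acls = {
    "briantest": {"names": ["brian"], "roles": []},
    "private": {"names": ["alice"], "roles": []},
    "public": {"names": [], "roles": []},
}
print(accessible_dbs(acls, "brian", []))
```

With millions of databases, that loop means opening millions of ACLs to
answer a single request.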

(2) _readers is a single monolithic object. I believe that it won't scale to
millions of users having access to the same database.

(3) _readers has no concurrency control. One admin making an ACL change in
futon (say) will silently overwrite changes made around the same time by
another admin. This will get worse the more frequently users are added and
removed.
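
A rough sketch of the kind of concurrency control that is missing, written
in Python rather than as CouchDB code. It uses the same idea as the _rev
check on ordinary documents: a write must name the revision it was based
on. The AclStore class and Conflict exception are hypothetical names used
only for illustration.

```python
class Conflict(Exception):
    """Raised when the caller's revision is stale."""


class AclStore:
    """In-memory stand-in for a _readers resource with a revision counter."""

    def __init__(self):
        self.rev = 0
        self.acl = {"names": [], "roles": []}

    def get(self):
        # Return a copy together with the revision it was read at.
        copy = {"names": list(self.acl["names"]), "roles": list(self.acl["roles"])}
        return self.rev, copy

    def put(self, expected_rev, new_acl):
        # Reject the write if the ACL has changed since the caller read it.
        if expected_rev != self.rev:
            raise Conflict(f"expected rev {expected_rev}, current rev {self.rev}")
        self.acl = new_acl
        self.rev += 1
        return self.rev


store = AclStore()
rev_a, acl_a = store.get()   # admin A reads
rev_b, acl_b = store.get()   # admin B reads the same revision
acl_a["names"].append("foo")
store.put(rev_a, acl_a)      # A's write succeeds
acl_b["names"].append("bar")
try:
    store.put(rev_b, acl_b)  # B's stale write is refused, not silently applied
    outcome = "overwritten"
except Conflict:
    outcome = "conflict"
print(outcome, store.get())
```

Admin B then has to re-read the ACL and reapply the change, so A's addition
of "foo" is not lost.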

For me, those are serious problems. I sketched a design for an alternative
approach, using the _users db to hold the authorizations in terms of
database-specific roles.  Unfortunately I didn't have time to contribute an
implementation of this.  If there's a chance this alternative approach would
be used then I will try to steal the time from somewhere.  The ideas behind
it weren't explicitly rejected, but neither were they acknowledged as a good
approach.

If the current design stays, then I think there will be sticking-plaster
solutions forever; e.g. proxies to fake out _all_dbs and ACL changes,
mapping them to a 'real' database behind.

Right, I've said all that. Now I have a few further observations from the
current implementation.

(4) An "admin" is not a "reader", and this is clearly intentional from
comments in the code. However, someone who has an "admin" role without
"reader" role is unable to perform ACL changes, which for me defeats the
whole purpose of the "admin" role.

Example: user "brianadmin" is in "_admins" on database "briantest", but not
in "_readers":

$ curl http://admin:admin@127.0.0.1:5984/briantest/_admins
{"names":["brianadmin"],"roles":[]}
$ curl http://admin:admin@127.0.0.1:5984/briantest/_readers
{"names":["brian"],"roles":[]}

But when "brianadmin" tries to update an ACL, here's what happens:

$ curl http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_readers
{"error":"unauthorized","reason":"You are not authorized to access this db."}
$ curl -X PUT -d '{"names":["foo","brian"],"roles":[]}' \
    http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_readers
{"error":"unauthorized","reason":"You are not authorized to access this db."}

Even if this were fixed so that a db admin had access to _readers and
_admins resources (and design docs), I think that in practice a database
administrator would be expected to have access to the database she is
administering. In that case, adding the same user to both "admin" and
"reader" roles simply involves duplicating data, as well as having to
remember to remove that user from two places when she leaves. That
introduces more scope for error.

So I'd propose the more relaxed approach: a database "admin" should inherit
"reader" rights. Isn't that true for a server-level admin anyway?
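
As a minimal sketch of that rule (Python, not CouchDB code; is_member and
can_read are hypothetical names), the read check would consult the admin
list first, so an admin never needs a second entry in _readers. An empty
_readers is assumed to mean permit-all, as in the current implementation.

```python
def is_member(acl, user, roles):
    """True if the user is named in the ACL or holds one of its roles."""
    return user in acl["names"] or any(r in acl["roles"] for r in roles)


def can_read(readers, admins, user, roles):
    # A database admin is treated as a reader without being listed twice.
    if is_member(admins, user, roles):
        return True
    # Assumption: an empty _readers is the permit-all default.
    if not readers["names"] and not readers["roles"]:
        return True
    return is_member(readers, user, roles)


# The ACLs from the briantest example above.
admins = {"names": ["brianadmin"], "roles": []}
readers = {"names": ["brian"], "roles": []}

print(can_read(readers, admins, "brianadmin", []))
print(can_read(readers, admins, "brian", []))
print(can_read(readers, admins, "foo", []))
```

Removing an administrator then means editing _admins only.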

(5) Non-admin readers can view the entire _readers, _admins and _security
resources.  I think this is quite a severe privacy concern, but it is easily
fixed.

(6) Databases are created world-readable by default, which means a race to
get the _readers set before someone else starts inserting documents. I think
a PUT /dbname option to set a non-empty readers list would be a good idea
(and a corresponding checkbox in futon)

(7) Couchdb accepts nonsense _readers documents, e.g.

$ curl -X PUT -d '{"names":{"foo":"bar"},"roles":456}' \
    http://admin:admin@127.0.0.1:5984/briantest/_readers
{"ok":true}

The effect is to reset the _readers document to its permit-all default, thus
opening up the database to the world.

$ curl http://127.0.0.1:5984/briantest/_readers
{"names":[],"roles":[]}
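
The check that seems to be missing can be sketched as follows. This is
Python for illustration only, not the Erlang in CouchDB; validate_acl is a
hypothetical name, and rejecting a document with a missing key is my own
choice rather than anything the current code does.

```python
import json


def validate_acl(doc):
    """Accept only {"names": [str, ...], "roles": [str, ...]}; raise otherwise."""
    if not isinstance(doc, dict):
        raise ValueError("ACL must be a JSON object")
    for key in ("names", "roles"):
        value = doc.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"'{key}' must be a list of strings")
    return {"names": doc["names"], "roles": doc["roles"]}


bodies = (
    '{"names":{"foo":"bar"},"roles":456}',   # the nonsense document above
    '{"names":["foo","brian"],"roles":[]}',  # a well-formed ACL
)
for body in bodies:
    try:
        validate_acl(json.loads(body))
        print("accepted:", body)
    except ValueError as err:
        print("rejected:", err)
```

A rejected PUT would leave the existing ACL untouched instead of resetting
it to the default.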

(8) Point (7) is arguably a simple bug which can be fixed, but I'd prefer
for couchdb to be fail-safe; that is, an empty ACL means nobody has access.

One way to achieve this would be for two new roles, "_anon" and "_user",
granted to all unauthenticated and authenticated users respectively. Then a
fully public database would have roles:["_anon","_user"], and this would
be added to a new database unless you ask otherwise (see point (6)).
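
A small sketch of how the fail-safe check could work with those two
implicit roles. It is Python for illustration, not CouchDB code; the
function names are hypothetical and user is None for an unauthenticated
request.

```python
def effective_roles(user, roles):
    """Add the implicit role: "_anon" if unauthenticated, "_user" otherwise."""
    implicit = "_anon" if user is None else "_user"
    return set(roles) | {implicit}


def can_read(acl, user, roles):
    # Fail-safe: with no matching name or role, access is denied,
    # so an empty ACL lets nobody in.
    if user is not None and user in acl["names"]:
        return True
    return bool(effective_roles(user, roles) & set(acl["roles"]))


EMPTY = {"names": [], "roles": []}
PUBLIC = {"names": [], "roles": ["_anon", "_user"]}
LOGGED_IN_ONLY = {"names": [], "roles": ["_user"]}

print(can_read(EMPTY, None, []), can_read(EMPTY, "brian", []))
print(can_read(PUBLIC, None, []), can_read(PUBLIC, "brian", []))
print(can_read(LOGGED_IN_ONLY, None, []), can_read(LOGGED_IN_ONLY, "brian", []))
```

Under this scheme a nonsense or wiped ACL closes the database rather than
opening it.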

(9) The _users db itself is world-readable (showing not only who your users
are, but their password hashes). Highly undesirable.

You can set a _readers ACL on it, but it has consequences:
* users can't sign up for new accounts
* users can't change their own passwords
This forces such things through a privileged external interface. Actually
I'm fine with that, because I want to validate signups anyway, but others
might not be.

(10) I don't think you can replicate _readers, _admins and _security, unless
(a) you are doing rsync filesystem-level replication, or (b) you explicitly
GET and PUT these resources from one DB to another. This is arguably a
feature not a bug.

(11) Trivial problem of the day: _security resources which are not objects
give an erlang-flavoured error.

$ curl -X PUT -d '["foo","bar"]' http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_security
{"error":"unknown_error","reason":"function_clause"}

Sorry for the long post, and if I really am barking up the wrong tree here,
please tell me so.

Regards,

Brian.
