couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Anderson <>
Subject DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)
Date Wed, 03 Feb 2010 17:21:10 GMT
On Wed, Feb 3, 2010 at 6:23 AM, Brian Candler <> wrote:
> I see the readeracl branch was recently merged into trunk, and I've just
> been testing it again.
> My concern is that the design is flawed, and that if this goes into 0.11
> then we are stuck with it forever; so it's better to address these sooner
> rather than later.

Let me see if I can address some of these concerns.

> I do see logic in keeping the admin/reader authorizations for a database
> within the database itself. The problems are:
> (1) _all_dbs currently shows everything - even those databases you don't
> have access to.
> I believe that in its current form, _all_dbs simply won't scale to millions
> of databases on a box if you want to limit it to accessible dbs only.

This is an interesting one. _all_dbs won't scale indefinitely even
before this patch, because it has no built-in pagination abilities.
Enhancing this feature to look into each file and keep going to till
it finds N that can be listed isn't hard to code. It will be a little
more work to make _all_dbs respect startkey and endkey.

I'd be thrilled if someone would patch _all_dbs to act correctly.

> (2) _readers is a single monolithic object. I believe that it won't scale to
> millions of users having access to the same database.

It's not meant to support this use case. If you have millions of users
with the same access rights, give them a common role and give that
role access to the database.

> (3) _readers has no concurrency control. One admin making an ACL change in
> futon (say) will silently overwrite changes made around the same time by
> another admin. This will get worse the more frequently users are added and
> removed.

_readers / _admins / _security are stored as a raw object without
concurrency control, because keeping them as a document adds too much
performance overhead on each request. Concurrency control is a
tradeoff we make here.

> For me, those are serious problems. I sketched a design for an alternative
> approach, using the _user db to hold the authorizations in terms of
> database-specific roles.  Unfortunately I didn't have time to contribute an
> implementation of this.  If there's a chance this alternative approach would
> be used then I will try to steal the time from somewhere.  The ideas behind
> it weren't explicitly rejected, but neither were they acknowledged as a good
> approach.

The database-specfic roles and names don't belong in the users db. The
users db is for answering the question: "who is the user and what
roles do they have". The ACLs say which names and roles can read or
admin a given database.

It's a fact of life that users can rsync db-files around. If the names
/ roles are in the users db, they get wrong when databases are moved
to another host or renamed on the current host.

> (4) An "admin" is not a "reader", and this is clearly intentional from
> comments in the code. However, someone who has an "admin" role without
> "reader" role is unable to perform ACL changes, which for me defeats the
> whole purpose of the "admin" role.
> So I'd propose that the relaxed approach is that a database "admin" should
> inherit "reader" rights. Isn't that true for a server-level admin anyway?

I considered this, but dropped it because it seemed like it'd send the
wrong message (if you put yourself as the only db-admin, this doesn't
make the db private, to do that you need to edit the readers.)

Looking at your shell transcript makes me think it'd make life easier
if we grant db-admins the reader privilege as well (so long as it
remains clear that dbs with admins but no readers are public).

4 is fixed.

> (5) Non-admin readers can view the entire _readers, _admins and _security
> resources.  I think this is quite a severe privacy concern, but it is easily
> fixed.

They can also read the design document. I'm not sure why this is a
privacy concern. A user may need to contact a db admin for help with
something, it's handy to be able to get a list of them. And it only
makes sense that you should be able to see the list of users who can
also access the same db you can.

If there's consensus that this is indeed an issue, it's not a hard
thing to change in the code.

> (6) Databases are created world-readable by default, which means a race to
> get the _readers set before someone else starts inserting documents. I think
> a PUT /dbname option to set a non-empty readers list would be a good idea
> (and a corresponding checkbox in futon)

Yes, I'd like to add some UI to the create-database form in Futon that
immediately sets a reader == the current user. The UI would be a
checkbox that says "make database private" or something. Making this
synchronous with db-creation would be a neat trick, too.

> (7) Couchdb accepts nonsense _readers documents, e.g.
> $ curl -X PUT -d '{"names":{"foo":"bar"},"roles":456}'
> http://admin:admin@
> {"ok":true}
> The effect is to reset the _readers document to its permit-all default, thus
> opening up the database to the world.
> $ curl
> {"names":[],"roles":[]}

My main concern in coding this was validating that malformed data
can't be stored. I've fixed this to throw errors in the case of
malformed data (instead of ignoring it).

> (8) Point (7) is arguably a simple bug which can be fixed, but I'd prefer
> for couchdb to be fail-safe; that is, an empty ACL means nobody has access.

The goal with this patch was to add some easy ACL functionality
without changing the default behavior. People who are happy to use
Couch in admin-party mode shouldn't see any changes.

> One way to achieve this would be for two new roles, "_anon" and "_user",
> granted to all unauthenticated and authenticated users respectively. Then a
> fully public database would have roles:["_anon","_user"], and this would
> be added to a new database unless you ask otherwise (see point (6)).

This ["_anon","_user"] business seems fine to me. It's a lot more
complexity for basically the same capabilities (although I do like the
ability to lock out anon users on a per-db basis, that's pretty cool.)

If anyone writes this, I'd be happy to see it. This won't be a small
patch because it's got to introduce those roles as well as use them in
the reader checks.

> (9) The _users db itself is world-readable (showing not only who your users
> are, but their password hashes). Highly undesirable.

I actually consider this a feature. We'd like to get some stronger
password hashing (see the bcrypt threads) which should help with the
password parts.

> You can set a _readers ACL on it, but it has consequences:
> * users can't sign up for new accounts
> * users can't change their own passwords
> This forces such things through a privileged external interface. Actually
> I'm fine with that, because I want to validate signups anyway, but others
> might not be.
> (10) I don't think you can replicate _readers, _admins and _security, unless
> (a) you are doing rsync filesystem-level replication, or (b) you explicitly
> GET and PUT these resources from one DB to another. This is arguably a
> feature not a bug.
> (11) Trivial problem of the day: _security resources which are not objects
> give an erlang-flavoured error.
> $ curl -X PUT -d '["foo","bar"]' http://brianadmin:brianadmin@
> {"error":"unknown_error","reason":"function_clause"}
> Sorry for the long post, and if I really am barking up the wrong tree here,
> please tell me so.

Thanks for pointing out a lot of small bugs. The ones I think have merit are:

1, 4, 6, 7, 8, 11

Of these I've fixed 4, 7 and 11 this morning.

If you can create Jira tickets for 1, 6, and 8 that'd be nice.


> Regards,
> Brian.

Chris Anderson

View raw message