couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject [DISCUSS] Per-doc access control
Date Sun, 17 Feb 2019 14:25:50 GMT
Hi Everyone,

I’m happy to share my work-in-progress attempt at implementing the per-doc access control feature
we discussed a good while ago:

https://lists.apache.org/thread.html/6aa77dd8e5974a3a540758c6902ccb509ab5a2e4802ecf4fd724a5e4@%3Cdev.couchdb.apache.org%3E

You can check out my branch here:

https://github.com/apache/couchdb/compare/access?expand=1

It is very much work in progress, but it is far enough along to warrant discussion.

The main point of this branch is to show all the places that we would need to change to support
the proposal.

Things I’ve left for later:

- Currently only the first element in the _access array is used. Our and/or syntax can be
added later.
- Building per-access views has not been implemented yet; couch_index would have to be taught
about the new per-access-id index.
- pretty HTTP error handling
- tests except for a tiny shell script 😇

Implementation notes:

You create a database with the _access feature turned on like so:  PUT /db?access=true
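
A minimal sketch of that request in Python, for illustration only. The base URL
(CouchDB's default port on localhost) and the helper name build_create_db_request are
assumptions, authentication is left out, and the request is only built, not sent:

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_create_db_request(base_url: str, db_name: str) -> Request:
    """Build (but do not send) the PUT request for an _access-enabled database.

    Hypothetical helper: only the `access=true` query parameter comes from the
    branch; everything else here is illustrative.
    """
    query = urlencode({"access": "true"})
    url = f"{base_url.rstrip('/')}/{quote(db_name, safe='')}?{query}"
    return Request(url, method="PUT")


# Assumed local node; sending would be urllib.request.urlopen(req) with admin auth.
req = build_create_db_request("http://127.0.0.1:5984", "db")
print(req.get_method(), req.full_url)
```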

I started out storing _access in the document body, as that would allow for a minimal
change set. However, on doc updates we try hard not to load the old doc body from the database,
and forcing us to do so for EVERY doc update under _access seemed prohibitive. So I extended
the #doc, #doc_info and #full_doc_info records with a new `access` attribute that is stored
in both by-id and by-seq. I will need guidance on how extending these records impacts multi-version
cluster interop, and especially on whether this is an acceptable approach.

https://github.com/apache/couchdb/compare/access?expand=1&ws=0#diff-904ab7473ff8ddd07ea44aca414e3a36

* * *

The main addition is a new native query server called couch_access_native_proc, which implements
two new indexes, by-access-id and by-access-seq. They do what you’d expect: pass in a userCtx
and retrieve the equivalent of _all_docs or _changes, but only including those docs that match
the username and roles in their _access property. The existing handlers for _all_docs and
_changes have been augmented to use the new indexes instead of the default ones, unless the
user is an admin.
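
To make the matching rule concrete, here is a rough Python model of it. This is not the
Erlang code in the branch, and it scans a list where the real indexes would be looked up
by access id. The function names are hypothetical, and treating the `_admin` role as the
admin check is an assumption based on how CouchDB userCtx objects usually look:

```python
def can_read(doc: dict, user_ctx: dict) -> bool:
    """Return True if user_ctx may see doc under the rules described above."""
    roles = user_ctx.get("roles", [])
    if "_admin" in roles:
        # Admins bypass the per-access indexes entirely (assumed role name).
        return True
    access = doc.get("_access") or []
    if not access:
        # Docs with no _access are admin-only for now.
        return False
    first = access[0]  # only the first _access element is used so far
    name = user_ctx.get("name")
    return (name is not None and first == name) or first in roles


def visible_docs(docs: list, user_ctx: dict) -> list:
    """Filter docs down to what an _all_docs-style request would return."""
    return [doc for doc in docs if can_read(doc, user_ctx)]


docs = [
    {"_id": "a", "_access": ["alice"]},
    {"_id": "b", "_access": ["team"]},
    {"_id": "c"},
]
alice = {"name": "alice", "roles": ["team"]}
print([doc["_id"] for doc in visible_docs(docs, alice)])
```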

https://github.com/apache/couchdb/compare/access?expand=1&ws=0#diff-fbb53323f07579be5e46ba63cb6701c4

* * *

The rest of the diff is concerned with making document CRUD behave as you’d expect. See
this little demonstration for what things look like:

https://gist.github.com/janl/b6d3f7502aa20b7b9ab9d9dcb8e92497
(I’m just noticing that there might be something wonky with DELETE, but you’ll get the
gist #rimshot)

* * *

Open questions:

- The aim of this is to get as close to regular CouchDB behaviour as possible. One new thing,
however, would require all apps to change: apps writing to an _access-enabled database have
to include an _access field in their docs (docs with no _access are admin-only for now). We
might want to consider auto-inserting the authenticated user’s name as the first element
of the _access array on new document writes, so existing apps “just work”.

- Interplay with partitioned dbs: eschewing db-per-user is already a large boon if you have
a lot of users, but making those per-user requests inside an _access-enabled database efficient
would be doubly nice. So why not take the username from the first question above and use it
as the partition key? This would work nicely for natural users with their own docs that want
to share them with others later. But I can easily imagine a pipelined use of CouchDB, where
a “collector” user creates all new docs, an “analyser” takes them over and hands them
to a “result” user for viewing. In that case, we’d violate the high-cardinality rule
of partitions (have a lot of small ones), because all docs go through all three users. I’d
be okay with treating the latter scenario as a minor use-case, but for that use-case, we should
be able to disable auto-partitioning on db creation.

- Building access view indexes for docs that have frequent _access changes leads to many orphaned
view indexes. We should look at an auto-cleanup solution here (maybe keep 1-N indexes in case
folks just swap back and forth).
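
For the first open question above, the auto-insert idea could look roughly like this.
It is a sketch in Python rather than the Erlang of the branch: with_default_access is a
hypothetical helper, and treating "no _rev" as "new document write" is an assumption:

```python
def with_default_access(doc: dict, user_ctx: dict) -> dict:
    """Return doc with a default _access of [username] on new writes.

    Existing _access values, updates (docs carrying a _rev) and anonymous
    writes are passed through unchanged.
    """
    if doc.get("_access"):
        return doc  # the app already set _access explicitly
    if "_rev" in doc:
        return doc  # assumed to be an update, not a new document write
    name = user_ctx.get("name")
    if name is None:
        return doc  # no authenticated user; the doc stays admin-only
    # Copy instead of mutating the caller's dict.
    return {**doc, "_access": [name]}


new_doc = with_default_access({"_id": "note-1", "text": "hi"}, {"name": "alice", "roles": []})
print(new_doc["_access"])
```

An app that never mentions _access would then end up with docs owned by whoever wrote
them, which is the “just work” behaviour the question is after.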

* * *

I’ll leave this here for now; I’m sure there are a few more things to consider.

I’d love to hear any and all feedback you might have. Especially if anything is unclear.

Best
Jan