couchdb-dev mailing list archives

From Garren Smith <gar...@apache.org>
Subject Re: [DISCUSS] Per-doc access control
Date Tue, 26 Feb 2019 10:18:45 GMT
Hi Jan,

I've been giving this some thought, and I wonder if we should take a step
back and rethink how we do this. Instead of implementing this directly in
the CouchDB core code, it might be better to write it as a separate
application, similar to Dreyfus, Cloudant's search [1]. I'm hoping you then
wouldn't have to make huge modifications to the CouchDB codebase, which
should make this easier to do. The application would override the _all_docs
and _changes endpoints, and if a user has enabled access=true for a
database, those requests would be served from your application. The epi
HTTP work is pretty fancy, and I think we could do some cool things around
that to make this work well. The app would listen to the changes feed of
any database that has access=true and build the required indexes for
_all_docs and _changes. I think we then would not have to create a custom
indexer, as we could build the indexes when new changes arrive.
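
To make the idea a bit more concrete, here is a minimal Python sketch of the
kind of bookkeeping such an app might do. It is only an illustration under
several assumptions: the class and method names (AccessIndexer,
handle_change, all_docs, changes) are hypothetical, sequence numbers are
plain integers rather than real CouchDB update sequences, the index lives in
memory, and deleted docs are simply dropped from the index. Like the WIP
branch, it only looks at the first element of _access.

```python
from collections import defaultdict


class AccessIndexer:
    """Toy in-memory stand-in for an app that follows a db's changes feed."""

    def __init__(self):
        # access id (user name or role) -> {doc_id: latest seq}
        self.by_access_id = defaultdict(dict)
        # doc_id -> access id the doc is currently indexed under
        self._owner = {}

    def handle_change(self, change):
        """Update the index from one changes-feed row (include_docs style)."""
        doc_id, seq = change["id"], change["seq"]
        # Remove any stale entry first, so _access changes move the doc.
        old = self._owner.pop(doc_id, None)
        if old is not None:
            del self.by_access_id[old][doc_id]
        if change.get("deleted"):
            return
        access = change["doc"].get("_access") or []
        if not access:
            return  # no _access: admin-only, so not indexed for anyone
        owner = access[0]  # assumption: only the first element is used
        self.by_access_id[owner][doc_id] = seq
        self._owner[doc_id] = owner

    def _visible(self, user_ctx):
        # Merge the entries for the user's name and each of their roles.
        merged = {}
        for key in [user_ctx["name"], *user_ctx.get("roles", [])]:
            merged.update(self.by_access_id.get(key, {}))
        return merged

    def all_docs(self, user_ctx):
        """Doc ids visible to user_ctx, sorted by id."""
        return sorted(self._visible(user_ctx))

    def changes(self, user_ctx, since=0):
        """(seq, doc_id) pairs visible to user_ctx after `since`."""
        visible = self._visible(user_ctx)
        return sorted((s, d) for d, s in visible.items() if s > since)
```

A real app would persist these indexes and deal with clustering, but the
shape of the work (consume a change, drop the old entry, write the new one)
would probably stay similar.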

I'm also hoping that another advantage of doing this as an app that listens
to the changes feed is that there should be minimal work to get this to
work when we switch to fdb.

This is obviously just an idea I had, and I thought I would share it, not in
an attempt to derail what you're doing, but hopefully to make sure we find
the easiest and most effective way to get this done.

Cheers
Garren


[1] https://github.com/cloudant-labs/dreyfus

On Sun, Feb 17, 2019 at 4:25 PM Jan Lehnardt <jan@apache.org> wrote:

> Hi Everyone,
>
> I’m happy to share my work in progress attempt to implement the per-doc
> access control feature we discussed a good while ago:
>
>
> https://lists.apache.org/thread.html/6aa77dd8e5974a3a540758c6902ccb509ab5a2e4802ecf4fd724a5e4@%3Cdev.couchdb.apache.org%3E
>
> You can check out my branch here:
>
> https://github.com/apache/couchdb/compare/access?expand=1
>
> It is very much work in progress, but it is far enough along to warrant
> discussion.
>
> The main point of this branch is to show all the places that we would need
> to change to support the proposal.
>
> Things I’ve left for later:
>
> - currently only the first element in the _access array is used. Our
> and/or syntax can be added later.
> - building per-access views has not been implemented yet, couch_index
> would have to be taught about the new per-access-id index.
> - pretty HTTP error handling
> - tests except for a tiny shell script 😇
>
> Implementation notes:
>
> You create a database with the _access feature turned on like so:  PUT
> /db?access=true
>
> I started out with storing _access in the document body, as that would
> allow for a minimal change set, however, on doc updates, we try hard not to
> load the old doc body from the database, and forcing us to do so for EVERY
> doc update under _access seemed prohibitive, so I extended the #doc,
> #doc_info and #full_doc_info records with a new `access` attribute that is
> stored in both by-id and by-seq. I will need guidance on how extending
> these records impacts multi-version cluster interop, and especially on
> whether this is an acceptable approach.
>
>
> https://github.com/apache/couchdb/compare/access?expand=1&ws=0#diff-904ab7473ff8ddd07ea44aca414e3a36
>
> * * *
>
> The main addition is a new native query server called
> couch_access_native_proc, which implements two new indexes, by-access-id and
> by-access-seq, which do what you’d expect: pass in a userCtx and retrieve
> the equivalent of _all_docs or _changes, but only including those docs that
> match the username and roles in their _access property. The existing
> handlers for _all_docs and _changes have been augmented to use the new
> indexes instead of the default ones, unless the user is an admin.
>
>
> https://github.com/apache/couchdb/compare/access?expand=1&ws=0#diff-fbb53323f07579be5e46ba63cb6701c4
>
> * * *
>
> The rest of the diff is concerned with making document CRUD behave as
> you’d expect. See this little demonstration for what things look like:
>
> https://gist.github.com/janl/b6d3f7502aa20b7b9ab9d9dcb8e92497
>
> (I’m just noticing that there might be something wonky with DELETE, but
> you’ll get the gist #rimshot)
>
> * * *
>
> Open questions:
>
> - The aim of this is to get as close to regular CouchDB behaviour as
> possible. One new requirement, which would force all apps to change, is
> that docs in an _access enabled database must include an _access field
> (docs with no _access are admin-only for now). We might want to consider
> auto-inserting the authenticated user’s name as the first element in the
> _access array on new document writes, so existing apps “just work”.
>
> - Interplay with partitioned dbs: eschewing db-per-user is already a large
> boon if you have a lot of users, but making those per-user requests inside
> an _access enabled database efficient would be doubly nice, so why not use
> the username from the first question above and use that as the partition
> key? This would work nicely for natural users with their own docs that want
> to share them with others later, but I can easily imagine a pipelined use
> of CouchDB, where a “collector” user creates all new docs, an “analyser”
> takes them over and hands them to a “result” user for viewing. In that case,
> we’d violate the high-cardinality rule of partitions (have a lot of small
> ones), because all docs go through all three users. I’d be okay with
> treating the latter scenario as a minor use-case, but for that use-case, we
> should be able to disable auto-partitioning on db creation.
>
> - Building access view indexes for docs that have frequent _access
> changes leads to many orphaned view indexes. We should look at an
> auto-cleanup solution here (maybe keep 1-N indexes in case folks just swap
> back and forth).
>
> * * *
>
> I’ll leave this here for now, I’m sure there are a few more things to
> consider.
>
> I’d love to hear any and all feedback you might have. Especially if
> anything is unclear.
>
> Best
> Jan
> —
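
For reference, the access rules described in the quoted message (first
element of _access only, docs without _access are admin-only, admins see
everything) and the proposed auto-insert on new writes can be sketched in a
few lines of Python. This is an illustration only, not the code in the
branch: the function names with_default_access and can_read are
hypothetical, and admin detection via the "_admin" role is an assumption.

```python
def with_default_access(doc, user_ctx):
    """Return a copy of doc, defaulting _access to the writer's name.

    Sketch of the auto-insert idea: a doc written without _access gets
    the authenticated user's name as the first (and only) element.
    """
    if doc.get("_access"):
        return dict(doc)
    return {**doc, "_access": [user_ctx["name"]]}


def can_read(doc, user_ctx):
    """Check read access using only the first _access element."""
    roles = user_ctx.get("roles", [])
    if "_admin" in roles:  # assumption: admins bypass the check
        return True
    access = doc.get("_access") or []
    if not access:
        return False  # no _access: admin-only for now
    return access[0] == user_ctx["name"] or access[0] in roles
```

With this default in place, an existing app that never sets _access would
still be able to read back the docs it wrote.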
