From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Rethinking access control evaluation
Date Fri, 04 Oct 2013 19:04:03 GMT

I was looking at OAK-1046 and OAK-774, and thinking about how we could
avoid the current heavy performance hit on access control evaluation.

I think the various caching approaches suggested in the above issues
and in other discussions are just addressing the symptoms of the
problem instead of the root cause, which I believe is the way we
currently do permission lookups.

For example, consider a simple content tree with a node like
/content/site/page/paragraph, with an ACL at /content/site that grants
everyone read access to that subtree. Assuming no other applicable
ACLs, currently (AFAICT) each property access on the paragraph node
will require permission store lookups for
/content/site/page/paragraph, /content/site/page, /content/site,
/content and / for all principals of the user until a match is found
for the everyone principal at /content/site. For a simple case with
just one user and one group principal before everyone, that's still 12
failed lookups before the match is found. The number can get a lot
higher for a deeper tree or a user with more principals. And that work
is done over and over again for each property that is being accessed.
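
To make the arithmetic above concrete, here is a minimal sketch that
counts the failed lookups. It is not Oak code: the class and method
names are made up, the permission store is modelled as a plain map,
and the principal-by-principal iteration order is an assumption,
picked because it reproduces the figure of 12.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the per-access lookup pattern (not Oak code). */
final class PerAccessLookup {

    /** Returns the path and all of its ancestors, deepest first, ending with "/". */
    static List<String> selfAndAncestors(String path) {
        List<String> paths = new ArrayList<>();
        String p = path;
        while (!p.equals("/")) {
            paths.add(p);
            int slash = p.lastIndexOf('/');
            p = slash == 0 ? "/" : p.substring(0, slash);
        }
        paths.add("/");
        return paths;
    }

    /**
     * Counts failed (principal, ancestorPath) lookups before the first hit.
     * The store maps a path to the principals that have an entry there.
     * Assumption: all paths are tried for one principal before moving on
     * to the next principal.
     */
    static int failedLookups(Map<String, Set<String>> store,
                             List<String> principals, String path) {
        int failed = 0;
        for (String principal : principals) {
            for (String p : selfAndAncestors(path)) {
                if (store.getOrDefault(p, Set.of()).contains(principal)) {
                    return failed;
                }
                failed++;
            }
        }
        return failed;
    }
}
```

With an entry for everyone at /content/site and the principals
(user, group, everyone), this yields 5 + 5 + 2 = 12 failed lookups
for /content/site/page/paragraph.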

Given such an access pattern it's no wonder that the access control
evaluation is so expensive. We could of course speed things up a lot
by caching the results of frequent lookups, but I think there's a much
simpler and more elegant solution.

Instead of repeatedly looking up things from the permission store
using a sequence of (principal, ancestorPath) keys for each property
being accessed, I suggest that we collect all the potentially
applicable ACEs along the path as we descend it through successive
SecurityContext.getChildContext() calls. This way when accessing
/content/site/page/paragraph, we'd end up looking up ACEs from /,
/content, /content/site, /content/site/page and
/content/site/page/paragraph. Most of those lookups could be
short-circuited by noticing that there is no rep:policy child node.
Once that's done (and since NodeStates are immutable) we'd already
know that the only matching ACE grants us read access, and thus the
extra overhead for reading properties of the paragraph node would be
essentially zero.
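
The descent described above could look roughly like the following
sketch. Everything here is hypothetical: Ace, AclNode and PathContext
are simplified stand-ins rather than the actual Oak classes, an empty
policy list models a missing rep:policy child, and the "deepest
applicable entry wins" rule is an assumption made to keep the example
short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical access control entry: grants or denies read to one principal. */
final class Ace {
    final String principal;
    final boolean allowRead;

    Ace(String principal, boolean allowRead) {
        this.principal = principal;
        this.allowRead = allowRead;
    }
}

/** Hypothetical immutable node; an empty policy models a missing rep:policy child. */
final class AclNode {
    final List<Ace> policy;
    final Map<String, AclNode> children;

    AclNode(List<Ace> policy, Map<String, AclNode> children) {
        this.policy = policy;
        this.children = children;
    }
}

/** Carries the entries applicable to the session's principals, root first. */
final class PathContext {
    static int policyScans = 0; // demo counter: non-empty policies scanned

    private final AclNode node;
    private final Set<String> principals;
    private final List<Ace> applicable;

    private PathContext(AclNode node, Set<String> principals, List<Ace> inherited) {
        this.node = node;
        this.principals = principals;
        if (node.policy.isEmpty()) {
            // Short-circuit: nothing to add, so reuse the parent's list as-is.
            this.applicable = inherited;
        } else {
            policyScans++;
            List<Ace> merged = new ArrayList<>(inherited);
            for (Ace ace : node.policy) {
                if (principals.contains(ace.principal)) {
                    merged.add(ace);
                }
            }
            this.applicable = merged;
        }
    }

    static PathContext root(AclNode root, Set<String> principals) {
        return new PathContext(root, principals, List.of());
    }

    /** Assumes the named child exists; a real implementation would check. */
    PathContext getChildContext(String name) {
        return new PathContext(node.children.get(name), principals, applicable);
    }

    /** Simplified rule (an assumption): deepest applicable entry wins, default deny. */
    boolean canRead() {
        return !applicable.isEmpty()
                && applicable.get(applicable.size() - 1).allowRead;
    }
}
```

Once a context exists for a node, canRead() touches nothing but the
list it already holds, so checking each property of that node needs
no further store access.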

Such an approach adds some up-front cost in the form of the ACE
lookups during path evaluation, but that cost is quickly amortized
since it's easy to memoize the results and since typical client
access patterns are highly localized. For example, after accessing the
paragraph node, reading /content/site/page/paragraph2 would only
require one extra lookup for ACEs under paragraph2, which also could
be short-circuited in the common case that there's no rep:policy child
node.
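
As a rough illustration of the amortization argument, the sketch
below caches the collected entries per path and counts how many
per-node lookups actually happen. MemoizedAces and the localAces
function are hypothetical names, entries are plain strings, and the
cache is only valid for a single immutable root state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-path cache of the entries collected from the root down. */
final class MemoizedAces {
    private final Map<String, List<String>> byPath = new HashMap<>();
    private final Function<String, List<String>> localAces; // per-node lookup
    int lookups = 0; // number of per-node lookups performed so far

    MemoizedAces(Function<String, List<String>> localAces) {
        this.localAces = localAces;
    }

    /** Returns the entries of all ancestors (root first) plus the node's own. */
    List<String> acesFor(String path) {
        List<String> cached = byPath.get(path);
        if (cached != null) {
            return cached; // already collected: no lookup needed
        }
        List<String> result = new ArrayList<>();
        if (!path.equals("/")) {
            int slash = path.lastIndexOf('/');
            String parent = slash == 0 ? "/" : path.substring(0, slash);
            result.addAll(acesFor(parent)); // served from the cache when warm
        }
        lookups++;
        result.addAll(localAces.apply(path));
        byPath.put(path, result);
        return result;
    }
}
```

Resolving /content/site/page/paragraph on a cold cache costs five
lookups (one per path segment plus the root); resolving
/content/site/page/paragraph2 afterwards costs one more.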


Jukka Zitting

