jackrabbit-oak-issues mailing list archives

From "Thomas Mueller (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-8167) With uneven distribution of ACL restriction across facet labels statistical facet count become too inaccurate
Date Tue, 23 Apr 2019 14:01:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16824155#comment-16824155 ]

Thomas Mueller commented on OAK-8167:

I updated the documentation of facets in http://svn.apache.org/r1858008. Now (in my view)
the security aspects are clearly documented. See http://jackrabbit.apache.org/oak/docs/query/lucene.html#facets
"Warning: this setting potentially leaks repository information the user that runs the query
may not see" [~anchela] do you think this is sufficient? 

I also documented the unfortunate drawback of the sampling method that is the motivation for
this issue, [~kexu] - "Do note that the beauty of sampling is that a sample size of 1000 has
an error rate of 3% with 95% confidence, if ACLs are evenly distributed over the sampled data.
However, often ACLs are not evenly distributed." (Technically, for the low error rate, the
ACLs would also need to be _independent_ of the PRNG used for sampling, but in practice I
don't think that's an issue).
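
As a rough check of the "3% with 95% confidence" figure, here is a minimal sketch of the usual normal-approximation margin of error for a sampled proportion. It assumes the worst-case proportion p = 0.5 and z = 1.96 for 95% confidence; the function name is illustrative and not part of Oak.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion estimated from `sample_size` independent samples.

    Assumptions (not from Oak): worst-case proportion 0.5 and
    z = 1.96, the two-sided 95% quantile of the normal distribution.
    """
    return z * math.sqrt(proportion * (1.0 - proportion) / sample_size)


# A sample of 1000 gives a margin of error of roughly 0.031, i.e. about 3%.
print(f"{margin_of_error(1000):.3f}")
```

This bound only describes the overall accessible ratio in the sample. It says nothing about how that ratio is spread across individual facet labels, which is where the uneven-distribution problem comes from.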

That done, I don't see a way to improve the situation if ACLs are _not_ evenly distributed.
So I'm afraid we will have to close this issue as "Won't fix".

> With uneven distribution of ACL restriction across facet labels statistical facet count become too inaccurate
> -------------------------------------------------------------------------------------------------------------
>                 Key: OAK-8167
>                 URL: https://issues.apache.org/jira/browse/OAK-8167
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene, query
>    Affects Versions: 1.6.16
>            Reporter: Kelvin Xu
>            Priority: Major
>              Labels: vulnerability
> With the statistical mode, the facet count is adjusted in proportion to the percentage of
> accessible samples, which works when secured content is scattered across different facets. In
> the edge case where a whole facet (all of its results) is not accessible, the count still shows a
> number after the sampling percentage is applied. Even if the number is small, the user experience
> is misleading/inaccurate, as nothing is returned when the facet is clicked (applied as a query condition).
> For example, consider an ACL/CUG-guarded "private" folder in which all the assets are tagged
> with the same facet value. A non-authorized user may still see this facet with a count but gets
> nothing when clicking on the facet.
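
The edge case described above can be reproduced with a simplified model of proportional scaling. This is only a sketch under stated assumptions, not Oak's implementation: it assumes one accessible ratio is estimated from a random sample of all results and then applied uniformly to every facet label. All names (`statistical_facet_counts`, `can_read`, the "public"/"private" labels) are hypothetical.

```python
import random


def statistical_facet_counts(docs, can_read, sample_size, seed=0):
    """Simplified model (an assumption, not Oak's code): estimate one
    global accessible ratio from a sample, then scale every raw facet
    count by that ratio.

    docs: list of (facet_label, doc_id) tuples.
    can_read: callable returning True if the user may read the doc.
    """
    if not docs:
        return {}
    rng = random.Random(seed)
    sample = rng.sample(docs, min(sample_size, len(docs)))
    accessible_ratio = sum(1 for doc in sample if can_read(doc)) / len(sample)

    # Raw counts ignore access control entirely.
    raw_counts = {}
    for label, _ in docs:
        raw_counts[label] = raw_counts.get(label, 0) + 1

    # The same ratio is applied to every label, regardless of where
    # the inaccessible documents actually are.
    return {label: round(count * accessible_ratio)
            for label, count in raw_counts.items()}


# 900 readable documents tagged "public", 100 unreadable ones tagged "private".
docs = [("public", i) for i in range(900)] + [("private", i) for i in range(100)]
counts = statistical_facet_counts(docs, lambda doc: doc[0] == "public", 1000)
print(counts)
```

In this model the "private" facet is reported with a non-zero count even though the user can read none of its documents, while the "public" facet is under-counted. The estimate is only accurate when restricted documents are spread evenly across the labels.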

This message was sent by Atlassian JIRA
