jackrabbit-dev mailing list archives

From "angela (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
Date Tue, 19 Mar 2013 12:33:18 GMT

    [ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606277#comment-13606277 ]

angela commented on JCR-3534:
-----------------------------

> What we want to use this feature for is for a "replication" feature in our app,
> which is really an infrastructure service and already uses an *admin* (or
> similar highly privileged user) session on the target jackrabbit instance to
> copy over content from another one. Having this special
> "access-by-datastore-id" permission flag would only be used for that case
> anyway.

so, you are justifying something that looks problematic security-wise with the
argument that it's just one specific use case and 'admin-only' :-)

IMHO we should not hack the built-in permissions for something that looks like
a nice feature for us without thinking about the consequences. claiming that
only our replication user would be allowed to do this is as naive as claiming
that something was just an 'admin-only' task. that's not how the repository is
being used, and if it was, we could equally just hardcode access to
getValueById to a single, dedicated user (which obviously is a bad idea).

> Now if we ensure the data store IDs are not guessable, then there is no
> option to browse the repository's binaries. If we avoid using simple hashes
> of the binary for the (exposed) ID, it will not be possible to check for the
> existence of certain documents known to an attacker. In fact, an attacker
> will only be able to get to the ID if he has the access rights to the content
> in the first place.
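
For illustration only, the difference between a "simple hash" and a
non-guessable ID could be sketched as follows. The sketch assumes the exposed
ID is an HMAC-SHA256 of the binary keyed with a repository-side secret. That
choice, and the names OpaqueContentIdSketch, plainHashId and keyedId, are
assumptions made for this example and do not describe how Jackrabbit's data
store actually derives its identifiers.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch, not Jackrabbit code. */
public class OpaqueContentIdSketch {

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Plain digest: anyone who holds the same bytes can compute this ID offline. */
    public static String plainHashId(byte[] content) {
        try {
            return hex(MessageDigest.getInstance("SHA-256").digest(content));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Keyed digest: computing this ID additionally requires the secret. */
    public static String keyedId(byte[] secret, byte[] content) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return hex(mac.doFinal(content));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] doc = "a document known to the attacker".getBytes(StandardCharsets.UTF_8);
        byte[] secret = "repository-side secret (assumed)".getBytes(StandardCharsets.UTF_8);
        System.out.println("plain: " + plainHashId(doc));
        System.out.println("keyed: " + keyedId(secret, doc));
    }
}
```

With the plain digest, an attacker who already has a copy of a document can
compute its ID and probe for its existence. With the keyed variant that is not
possible without the secret.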

that's correct and i don't see a problem with this part.
what is problematic IMO is the fact that once you get access to the content ID
you may be able to look at the binary irrespective of the accessibility of the
property (or properties) that hold(s) this value.


in other words: what we are adding here is an additional dimension to the way
access control is used and enforced by the repository. we have permissions on
nodes and properties, and we are extending this to values irrespective of
which property the value was attached to.
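
The concern can be made concrete with a toy model. The following is a minimal
sketch and not Jackrabbit code: the class ContentIdAccessSketch and its
methods are hypothetical, and the in-memory maps merely stand in for the
repository's access control and the shared data store. It shows that a lookup
keyed only by content ID has no property to evaluate permissions against.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy model, not Jackrabbit code. */
public class ContentIdAccessSketch {
    // property path -> content ID of the binary value it holds
    private final Map<String, String> propertyToContentId = new HashMap<>();
    // content ID -> binary payload (stands in for the shared data store)
    private final Map<String, byte[]> dataStore = new HashMap<>();
    // user -> property paths that user is allowed to read
    private final Map<String, Set<String>> readable = new HashMap<>();

    public void store(String path, String contentId, byte[] data) {
        propertyToContentId.put(path, contentId);
        dataStore.put(contentId, data);
    }

    public void grantRead(String user, String path) {
        readable.computeIfAbsent(user, u -> new HashSet<>()).add(path);
    }

    /** Regular access: the decision is made per property. */
    public byte[] readProperty(String user, String path) {
        Set<String> allowed = readable.getOrDefault(user, Collections.<String>emptySet());
        if (!allowed.contains(path)) {
            throw new SecurityException("access denied: " + path);
        }
        return dataStore.get(propertyToContentId.get(path));
    }

    /** By-ID access: no property context, so no per-property check happens here. */
    public byte[] readByContentId(String user, String contentId) {
        return dataStore.get(contentId);
    }

    public static void main(String[] args) {
        ContentIdAccessSketch repo = new ContentIdAccessSketch();
        repo.store("/private/video/jcr:data", "id-123", new byte[] {1, 2, 3});
        try {
            repo.readProperty("guest", "/private/video/jcr:data");
        } catch (SecurityException e) {
            System.out.println("property read: " + e.getMessage());
        }
        byte[] leaked = repo.readByContentId("guest", "id-123");
        System.out.println("by-ID read returned " + leaked.length + " bytes");
    }
}
```

In this model the only thing protecting the binary on the by-ID path is the
secrecy of the ID itself, which is the extra dimension described above.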

if we add the contentId handling to the API, it will be used (i can see the
service coming that exposes contentIds with an admin session, which is
searchable and where Google's inurl operator will be a perfect fit to find
all kinds of contentIds all over the world :-)

don't get me wrong: i am not opposed to having this in general, but i am
totally opposed to just hacking it in without having a clear picture of what
we are doing and a careful reevaluation of what that actually means for our
threat model and for further development (including oak).

> If not, we really need to think of bringing a similar performance-optimized
> replication feature into Jackrabbit / Oak itself.

again: no objection to this.... but 'thinking' is definitely the key word here :-)

                
> Add JackrabbitSession.getValueByContentId method
> ------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>         Attachments: JCR-3534.patch
>
>
> we have a couple of use cases, where we would like to leverage the global
> data store to prevent sending around and copying around large binary data
> unnecessarily: We have two separate Jackrabbit instances configured to use
> the same DataStore (for the sake of this discussion assume we have the
> problems of concurrent access and garbage collection under control). When
> sending content from one instance to the other instance we don't want to
> send potentially large binary data (e.g. video files) if not needed.
> The idea is for the sender to just send the content identity from
> JackrabbitValue.getContentIdentity(). The receiver would then check whether
> such content already exists and would reuse it if so:
> String ci = contentIdentity_from_sender;
> try {
>     Value v = session.getValueByContentIdentity(ci);
>     Property p = targetNode.setProperty(propName, v);
> } catch (ItemNotFoundException ie) {
>     // unknown or invalid content Identity
> } catch (RepositoryException re) {
>     // some other exception
> }
> Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method
> would allow for round tripping the JackrabbitValue.getContentIdentity()
> preventing superfluous binary data copying and moving.
> See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
