Mailing-List: contact dev-help@jackrabbit.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@jackrabbit.apache.org
Date: Tue, 19 Mar 2013 12:05:15 +0000 (UTC)
From: "Alexander Klimetschek (JIRA)" <jira@apache.org>
To: dev@jackrabbit.apache.org
Message-ID: <JIRA.12637199.1363355457852.10672.1363694715902@arcas>
In-Reply-To: <JIRA.12637199.1363355457852@arcas>
References: <JIRA.12637199.1363355457852@arcas>
Subject: [jira] [Commented] (JCR-3534) Add
 JackrabbitSession.getValueByContentId method
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606270#comment-13606270 ] 

Alexander Klimetschek commented on JCR-3534:
--------------------------------------------

I agree that the security aspects of this seem problematic, but I think they go away if we look at the larger picture.

We want to use this for a "replication" feature in our app, which is really an infrastructure service and uses an *admin* (or similarly highly privileged) session on the target Jackrabbit instance to copy over content from another instance. Having this special "access-by-datastore-id" permission flag is actually slightly better than before, when there were no restrictions at all.

Now, if we ensure the data store IDs are not guessable, there is no way to browse the repository's binaries. If we avoid using simple hashes of the binary for the (exposed) ID, it will not be possible to check for the existence of certain documents known to an attacker. In fact, an attacker will only be able to obtain the ID if they already have access rights to the content in the first place.
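
To make the "not guessable" point concrete, here is a minimal Java sketch. It assumes the exposed ID is the internal data store ID plus an HMAC computed with a secret known only to the repositories sharing the data store. The class name OpaqueContentId, its methods, and the "id:signature" format are hypothetical and only for illustration; they are not part of the attached patch or of any Jackrabbit API.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: signs internal data store IDs so that exposed IDs
// cannot be forged from a known binary's hash alone.
final class OpaqueContentId {
    private static final String ALGORITHM = "HmacSHA256";
    private final SecretKeySpec key;

    OpaqueContentId(byte[] secret) {
        this.key = new SecretKeySpec(secret.clone(), ALGORITHM);
    }

    // Returns "<internalId>:<hex HMAC of internalId>".
    String expose(String internalId) {
        return internalId + ":" + sign(internalId);
    }

    // Returns the internal ID if the signature matches, otherwise null.
    String resolve(String exposedId) {
        int sep = exposedId.lastIndexOf(':');
        if (sep < 0) {
            return null; // no signature part at all
        }
        String internalId = exposedId.substring(0, sep);
        byte[] expected = sign(internalId).getBytes(StandardCharsets.US_ASCII);
        byte[] actual = exposedId.substring(sep + 1).getBytes(StandardCharsets.US_ASCII);
        // Constant-time comparison of the two signatures
        return MessageDigest.isEqual(expected, actual) ? internalId : null;
    }

    private String sign(String internalId) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(key);
            byte[] raw = mac.doFinal(internalId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(raw.length * 2);
            for (byte b : raw) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC not available", e);
        }
    }
}
```

With something along these lines, knowing the hash of a document is not enough to construct a valid ID: resolve() returns null unless the signature was produced with the shared secret. Both instances would need the same secret, which fits the shared data store setup described in the issue.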

IMHO that is all acceptable; except for such a special replication system user, no normal JCR user would ever need to have that permission turned on (and documentation should say so).

If not, we really need to think about bringing a similar performance-optimized replication feature into Jackrabbit / Oak itself.
                
> Add JackrabbitSession.getValueByContentId method
> ------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>         Attachments: JCR-3534.patch
>
>
> We have a couple of use cases where we would like to leverage the global data store to avoid sending and copying large binary data unnecessarily. We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion, assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other, we don't want to send potentially large binary data (e.g. video files) if not needed.
> The idea is for the sender to send just the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether such content already exists and, if so, reuse it:
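> String ci = contentIdentity_from_sender; // identity received from the sending instance
> try {
>     // look up the existing binary in the shared data store by its content identity
>     Value v = session.getValueByContentIdentity(ci);
>     // reuse it on the target node without transferring the binary itself
>     Property p = targetNode.setProperty(propName, v);
> } catch (ItemNotFoundException ie) {
>     // unknown or invalid content identity
> } catch (RepositoryException re) {
>     // some other exception
> }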
> Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow round-tripping the value of JackrabbitValue.getContentIdentity(), preventing superfluous copying and moving of binary data.
> See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi
