jackrabbit-dev mailing list archives

From "Alexander Klimetschek (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-3534) Efficient copying of binaries across repositories with the same data store
Date Tue, 07 May 2013 18:21:16 GMT

    [ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651138#comment-13651138 ]

Alexander Klimetschek commented on JCR-3534:
--------------------------------------------

The main point is actually that the signature MUST be created inside the repository implementation:
I can't see how client code would have access to the data store secret, which is required to
create the signature.

This leads to the question of changing JackrabbitValue#getContentIdentity(): I did not find any use
of getContentIdentity() in our proprietary code; Jackrabbit itself only provides it, and there is no
utility etc. that makes use of it. I wonder what you could actually do with it so far, other than
checking for equality with a content id fetched from somewhere else? But if the HMAC signature is
expected to change every time, this would break the API contract afaics, which clearly states "If two
values have the same identifier, the content of the value is guaranteed to be the same". This
forces us to add a new API that returns this signed message; no way around that afaics.

That would be my proposal:

- BinaryReferenceMessage is just a data object holding a "token/message" string (passed via
constructor and read via getter)
- its other Binary methods are empty/no-op
- message format creation, including the signature, is done inside the new API (e.g.
JackrabbitValue#getSecureContentIdentity())
- message format parsing and signature validation all happen inside createValue(Binary)
when it sees a BinaryReferenceMessage
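
To illustrate the signing and validation steps, here is a minimal sketch using only the JDK's
javax.crypto.Mac. Everything in it is an assumption made for illustration: the class name
SignedContentIdentity, the "identifier:hexSignature" message format and the choice of HmacSHA256
are hypothetical and are not taken from Jackrabbit or from the attached patches. The sketch is also
deterministic; a signature that changes every time would additionally need a timestamp or nonce
in the signed message.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper, not part of Jackrabbit: signs and validates content identities. */
final class SignedContentIdentity {

    // Assumed algorithm; the issue does not prescribe one.
    private static final String ALGORITHM = "HmacSHA256";

    // Stands in for the data store secret that only the repository implementation knows.
    private final byte[] secret;

    SignedContentIdentity(byte[] dataStoreSecret) {
        this.secret = dataStoreSecret.clone();
    }

    /** Builds the message "contentIdentity:hexSignature" (assumed format). */
    String sign(String contentIdentity) {
        return contentIdentity + ":" + hex(hmac(contentIdentity));
    }

    /** Returns the content identity if the signature is valid, otherwise null. */
    String verify(String message) {
        int sep = message.lastIndexOf(':');
        if (sep < 0) {
            return null; // no signature part present
        }
        String identity = message.substring(0, sep);
        byte[] expected = hex(hmac(identity)).getBytes(StandardCharsets.US_ASCII);
        byte[] actual = message.substring(sep + 1).getBytes(StandardCharsets.US_ASCII);
        // MessageDigest.isEqual avoids a naive early-exit string comparison.
        return MessageDigest.isEqual(expected, actual) ? identity : null;
    }

    private byte[] hmac(String data) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(secret, ALGORITHM));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

In the proposal above, sign() would correspond to what the new API (e.g. getSecureContentIdentity())
does internally, and verify() to what createValue(Binary) does when it is handed a
BinaryReferenceMessage. Client code only ever passes the opaque string along.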

                
> Efficient copying of binaries across repositories with the same data store
> --------------------------------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>            Assignee: Tommaso Teofili
>         Attachments: JCR-3534.2.patch, JCR-3534.3.patch, JCR-3534.patch, JCR-3534.patch
>
>
> We have a couple of use cases where we would like to leverage the global data store
> to avoid sending and copying large binary data unnecessarily: we have two
> separate Jackrabbit instances configured to use the same DataStore (for the sake of this
> discussion, assume we have the problems of concurrent access and garbage collection under
> control). When sending content from one instance to the other, we don't want to send
> potentially large binary data (e.g. video files) if it is not needed.
> The idea is for the sender to just send the content identity from
> JackrabbitValue.getContentIdentity(). The receiver would then check whether such content
> already exists and would reuse it if so:
> String ci = contentIdentity_from_sender;
> try {
>     Value v = session.getValueByContentIdentity(ci);
>     Property p = targetNode.setProperty(propName, v);
> } catch (ItemNotFoundException ie) {
>     // unknown or invalid content Identity
> } catch (RepositoryException re) {
>     // some other exception
> }
> Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow
> for round-tripping the JackrabbitValue.getContentIdentity(), preventing superfluous copying
> and moving of binary data.
> See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi

