jackrabbit-dev mailing list archives

From "Marcel Reutegger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-3534) Efficient copying of binaries across repositories with the same data store
Date Wed, 08 May 2013 09:59:20 GMT

    [ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651755#comment-13651755 ]

Marcel Reutegger commented on JCR-3534:
---------------------------------------

Summary of an offline discussion with Tommaso, Alex, Jukka, Thomas and Marcel:

- Introduce a new interface in the Jackrabbit API: ReferenceBinary, extending Binary with a
getReference method that returns a String
- Add a SimpleReferenceBinary class to jcr-commons implementing the above. Its Binary interface
methods will throw an exception (IllegalStateException or RepositoryException) when called
- A SimpleReferenceBinary is created by supplying a String consisting of identifier+HMAC
- A ReferenceBinary will be created without the HMAC being supplied externally. We will provide
a mechanism to get a reference from a blob in the store
- Add a new method getReference to DataIdentifier (jackrabbit-core) which returns the reference
as a String
- Create an AbstractDataStore which all DataStore implementations extend
- Provide a mechanism in AbstractDataStore that creates an identifier from the reference
- Implementations of Binary also need to implement the ReferenceBinary interface if the
binary supports a reference
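
To illustrate the identifier+HMAC idea, here is a minimal sketch of how a reference could be
built from an identifier and later turned back into an identifier. The ReferenceCodec class,
the "identifier:hexHmac" layout, the HmacSHA1 algorithm and the shared secret are all
assumptions made for this example; the summary does not specify any of them.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper; the "identifier:hexHmac" layout is an assumption for illustration. */
final class ReferenceCodec {
    private static final String ALGORITHM = "HmacSHA1";
    private final byte[] secret;

    ReferenceCodec(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Builds a reference string from a data store identifier. */
    String toReference(String identifier) {
        return identifier + ':' + hex(hmac(identifier));
    }

    /** Returns the identifier if the HMAC part matches, or null otherwise. */
    String toIdentifier(String reference) {
        int colon = reference.indexOf(':');
        if (colon < 0) {
            return null;
        }
        String identifier = reference.substring(0, colon);
        byte[] expected = hex(hmac(identifier)).getBytes(StandardCharsets.US_ASCII);
        byte[] actual = reference.substring(colon + 1).getBytes(StandardCharsets.US_ASCII);
        // Compare via MessageDigest.isEqual rather than String.equals
        return MessageDigest.isEqual(expected, actual) ? identifier : null;
    }

    private byte[] hmac(String identifier) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(secret, ALGORITHM));
            return mac.doFinal(identifier.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

In this sketch only code that knows the secret can produce a reference that toIdentifier
accepts, which is presumably why the HMAC is not meant to be supplied from outside.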

Client code will then use it like this:
- On the sender side, obtain the ReferenceBinary by checking Binary instanceof ReferenceBinary
- On the receiver side, create a SimpleReferenceBinary from the reference, then create a value
by passing the SimpleReferenceBinary to ValueFactory.createValue(Binary)
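
The two client-side steps could look roughly like the sketch below. To keep it compilable on
its own, Binary here is a one-method stand-in for javax.jcr.Binary, and ReferenceTransfer is
a hypothetical helper; only the names ReferenceBinary, SimpleReferenceBinary and getReference
come from the summary above.

```java
import java.io.InputStream;

/** Stand-in for javax.jcr.Binary, reduced to one method so the sketch is self-contained. */
interface Binary {
    InputStream getStream();
}

/** The proposed interface: a Binary that can expose a reference. */
interface ReferenceBinary extends Binary {
    String getReference();
}

/** Receiver-side placeholder that carries only the reference and refuses content access. */
final class SimpleReferenceBinary implements ReferenceBinary {
    private final String reference;

    SimpleReferenceBinary(String reference) {
        this.reference = reference;
    }

    @Override
    public String getReference() {
        return reference;
    }

    @Override
    public InputStream getStream() {
        throw new IllegalStateException("This binary only carries a reference");
    }
}

/** Hypothetical helper showing the sender and receiver steps. */
final class ReferenceTransfer {
    /** Sender side: returns the reference, or null if the binary has to be streamed instead. */
    static String referenceOf(Binary binary) {
        if (binary instanceof ReferenceBinary) {
            return ((ReferenceBinary) binary).getReference();
        }
        return null;
    }

    /** Receiver side: wraps the reference; the result would go to ValueFactory.createValue(Binary). */
    static Binary fromReference(String reference) {
        return new SimpleReferenceBinary(reference);
    }
}
```

A sender that gets null from referenceOf would fall back to transferring the binary content
as before.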

Comments welcome.
                
> Efficient copying of binaries across repositories with the same data store
> --------------------------------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>            Assignee: Tommaso Teofili
>         Attachments: JCR-3534.2.patch, JCR-3534.3.patch, JCR-3534.patch, JCR-3534.patch
>
>
> We have a couple of use cases where we would like to leverage the global data store
to avoid sending and copying large binary data unnecessarily: We have two
separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion
assume we have the problems of concurrent access and garbage collection under control). When
sending content from one instance to the other instance we don't want to send potentially
large binary data (e.g. video files) if not needed.
> The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity().
The receiver would then check whether such content already exists and would reuse it if so:
> String ci = contentIdentity_from_sender;
> try {
>     Value v = session.getValueByContentIdentity(ci);
>     Property p = targetNode.setProperty(propName, v);
> } catch (ItemNotFoundException ie) {
>     // unknown or invalid content Identity
> } catch (RepositoryException re) {
>     // some other exception
> }
> Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow
for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary
data copying and moving. 
> See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
