jackrabbit-oak-issues mailing list archives

From "Matt Ryan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-8186) Create API in OAK for file access to binaries in the repository.
Date Fri, 05 Apr 2019 15:22:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810934#comment-16810934 ]

Matt Ryan commented on OAK-8186:

I had a discussion with [~hsaginor@gmail.com] offline yesterday to work through some of the
questions.  To summarize, we determined that the intent of this proposal is to allow
*processing of a binary by more efficient means* than streaming the binary through
the JVM.  We clarified the following points:

* The proposal applies only to Oak instances using FileDataStore.
* Oak will not provide direct access to any file.  The proposal must only be about access
to a copy of the file, created in a temporary location.
* The access is effectively read-only, meaning that Oak will not directly apply any changes
made to the file.  If changes are made that the user wishes to apply, the changed binary must
be applied as an update via existing JCR API.
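
To make the copy semantics above concrete, here is a minimal sketch using only java.nio.
It is not the proposed Oak API: the class name, the createTempCopy helper and the
"datastore-record-" stand-in file are all hypothetical, and a plain local file plays the
role of a FileDataStore record.  It only shows that changes made to the temp copy never
reach the stored file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempCopySketch {

    /** Hypothetical helper: copies a stored file to a fresh temp file and returns the copy's path. */
    static Path createTempCopy(Path storedFile) throws IOException {
        Path tmp = Files.createTempFile("oak-binary-", ".tmp");
        try {
            // createTempFile already created an empty file, so the copy must overwrite it.
            Files.copy(storedFile, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // do not leave a half-written temp file behind
            throw e;
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: this local file stands in for a binary held by the data store.
        Path stored = Files.createTempFile("datastore-record-", ".bin");
        Files.write(stored, "original".getBytes(StandardCharsets.UTF_8));

        Path copy = createTempCopy(stored);
        try {
            // The client may change its copy freely ...
            Files.write(copy, "changed by client".getBytes(StandardCharsets.UTF_8));
            // ... but the stored file is untouched.
            System.out.println(new String(Files.readAllBytes(stored), StandardCharsets.UTF_8)); // prints "original"
        } finally {
            Files.deleteIfExists(copy);
            Files.deleteIfExists(stored);
        }
    }
}
```

To persist a change in this model, the client would still have to write the modified
content back through the existing JCR API, as noted in the last point.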

Again, the intent of the proposal is to allow the creation of the temporary file and third-party
access to and processing of the file via more efficient means than streaming the binary through
the JVM and the JCR APIs.  See [this comment|https://issues.apache.org/jira/browse/OAK-8186?focusedCommentId=16808802&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16808802]
for an example use.

I've requested that more detailed testing of the proposal be done to figure out at what point
making a file copy and working directly with the copied file is more efficient overall than
using the existing supported approach.  Having more data should help validate the justification
(or not).

Some open questions:
* Who should be responsible for deleting the temporary file after use?  It seems to me the client
should; the client knows when the file is no longer needed.  I don't want to burden Oak with the
responsibility of deleting the temporary files.
* If we were to implement such a feature, would we limit it to FileDataStore or also support
it for the cloud data stores?  The same use case would apply either way.  Clients could of
course use the direct download URI for cloud data stores to make their own temp file, but
in theory Oak could also provide a single API for creating the temp file and for cloud data
stores use the direct download API to make the temp copy.
 ** Personally I'm less worried about how to do it for cloud data stores and more worried
about whether we should do it at all.  The cloud data stores with direct binary access have
the effect of moving a lot of the binary state off of the Oak instance; creating a temp file
seems a step backwards.
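
On the first question, one way client-side cleanup could look is a closeable handle used
with try-with-resources.  This is only a sketch under assumptions: TempFileHandle is a
hypothetical type, not the TempFileReference from the linked branch, and the temp file is
created directly here rather than by Oak.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempFileHandleSketch {

    /** Hypothetical handle: the client owns the temp file and deletes it by closing the handle. */
    static final class TempFileHandle implements AutoCloseable {
        private final Path path;

        TempFileHandle(Path path) {
            this.path = path;
        }

        Path path() {
            return path;
        }

        @Override
        public void close() throws IOException {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in a real API the repository would hand out this file; here it is created locally.
        Path tmp = Files.createTempFile("oak-binary-", ".tmp");
        try (TempFileHandle handle = new TempFileHandle(tmp)) {
            Files.write(handle.path(), new byte[] {1, 2, 3});
            System.out.println(Files.exists(handle.path())); // prints "true"
        }
        // Leaving the try block closed the handle, which deleted the file.
        System.out.println(Files.exists(tmp)); // prints "false"
    }
}
```

With this shape the repository never has to track or expire temp files; a client that
forgets to close the handle leaks a file, which is the trade-off of putting the
responsibility on the client.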

Comments?  /cc [~mduerig]/[~frm]/[~teofili]

> Create API in OAK for file access to binaries in the repository.
> ----------------------------------------------------------------
>                 Key: OAK-8186
>                 URL: https://issues.apache.org/jira/browse/OAK-8186
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>            Reporter: Henry Saginor
>            Priority: Major
>         Attachments: OAK File Access.jpg
> To get file access, applications normally write binaries to temp files. It would be nice
> if an API existed to get file access directly from OAK. This might also meet some use cases
> documented at [https://wiki.apache.org/jackrabbit/JCR%20Binary%20Usecase]
> Suggested API and implementation can be found here [1]. Also, see attached diagram [2].
> I can create a patch if I can get some feedback. Note that the suggested API makes it explicit
> that a temp file is created. I am not sure if direct access to files in the datastore would be
> safe. But I am open to suggestions.
> [1]
>  [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/FileReferencable.java]
>  [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReference.java]
>  [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-api/src/main/java/org/apache/jackrabbit/oak/api/blob/TempFileReferenceProvider.java]
>  [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/FileDSBlobTempFileReference.java]
>  [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/DataStoreBlobStore.java]
>  [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentBlob.java]
>  [https://github.com/hsaginor/jackrabbit-oak/blob/directFileAccess/oak-store-spi/src/main/java/org/apache/jackrabbit/oak/plugins/value/jcr/BinaryImpl.java]
> [2]
> !OAK File Access.jpg!

This message was sent by Atlassian JIRA
