jackrabbit-dev mailing list archives

From "Pablo Rios (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-926) Global data store for binaries
Date Thu, 21 Jun 2007 02:35:26 GMT

    [ https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506744 ]

Pablo Rios commented on JCR-926:

I ran ReadWhileSaveTest many times with the patch applied and FineGrainedISMLocking in use.
The results are completely different from the earlier ones and show in each case that this
locking strategy clearly improves concurrency.

This is the output of two test runs:

Wed Jun 20 19:03:54 PDT 2007 - setProperty() - 1
Wed Jun 20 19:04:14 PDT 2007 - begin save() - 186
Wed Jun 20 19:04:36 PDT 2007 - end save() - 402
numReads: 403

Wed Jun 20 19:18:31 PDT 2007 - setProperty() - 1
Wed Jun 20 19:18:49 PDT 2007 - begin save() - 175
Wed Jun 20 19:19:09 PDT 2007 - end save() - 373
numReads: 373

In all the runs I got similar results.


> Global data store for binaries
> ------------------------------
>                 Key: JCR-926
>                 URL: https://issues.apache.org/jira/browse/JCR-926
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Jukka Zitting
>         Attachments: DataStore.patch, DataStore2.patch, ReadWhileSaveTest.patch
> There are three main problems with the way Jackrabbit currently handles large binary values:
> 1) Persisting a large binary value blocks access to the persistence layer for extended
amounts of time (see JCR-314)
> 2) At least two copies of binary streams are made when saving them through the JCR API:
one in the transient space, and one when persisting the value
> 3) Versioning and copy operations on nodes or subtrees that contain large binary values
can quickly end up consuming excessive amounts of storage space.
> To solve these issues (and to get other nice benefits), I propose that we implement a
global "data store" concept in the repository. A data store is an append-only set of binary
values that uses short identifiers to identify and access the stored binary values. The data
store would trivially fit the requirements of transient space and transaction handling due
to the append-only nature. An explicit mark-and-sweep garbage collection process could be
added to avoid concerns about storing garbage values.
> See the recent NGP value record discussion, especially [1], for more background on this
> [1] http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200705.mbox/%3c510143ac0705120919k37d48dc1jc7474b23c9f02cbd@mail.gmail.com%3e
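
To make the proposal above concrete, here is a minimal in-memory sketch of an append-only data store with a sweep step. It is not the API from DataStore.patch: the class name AppendOnlyDataStore, its method names, and the choice of a hex SHA-256 content digest as the "short identifier" are all assumptions made for illustration. The issue text only says that identifiers are short and that the store is append-only.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of an append-only binary store; names are illustrative only. */
final class AppendOnlyDataStore {
    // Records are only ever added by add(); nothing is updated in place.
    private final Map<String, byte[]> records = new ConcurrentHashMap<>();

    /** Stores a value and returns its identifier (assumption: hex SHA-256 of the content). */
    String add(byte[] value) {
        String id = digest(value);
        // Identical content maps to the same identifier, so it is stored once.
        records.putIfAbsent(id, value.clone());
        return id;
    }

    /** Returns a copy of the stored value, or null if the identifier is unknown. */
    byte[] get(String id) {
        byte[] value = records.get(id);
        return value == null ? null : value.clone();
    }

    /**
     * Sweep phase of a mark-and-sweep collection: removes every record whose
     * identifier is not in the marked set and returns how many were removed.
     * The mark phase (collecting identifiers still referenced) is left to the caller.
     */
    int sweep(Set<String> marked) {
        int before = records.size();
        records.keySet().retainAll(marked);
        return before - records.size();
    }

    int size() {
        return records.size();
    }

    private static String digest(byte[] value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(value)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, adding the same bytes twice returns the same identifier and keeps a single record, which is one way a versioning or copy operation could avoid duplicating a large binary. A real implementation would stream values to disk instead of holding byte arrays in memory.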

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
