jackrabbit-dev mailing list archives

From "Thomas Mueller (JIRA)" <j...@apache.org>
Subject [jira] Updated: (JCR-926) Global data store for binaries
Date Fri, 22 Jun 2007 09:16:26 GMT

    [ https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated JCR-926:
-------------------------------

    Attachment: internalValue.patch

This is a patch to clean up InternalValue.internalValue():

Before replacing the BLOBFileValue class with the new 'Global Data Store' implementation,
I wanted to clean up a few things in InternalValue. The method InternalValue.internalValue()
was called a lot in the Jackrabbit code. It returns a java.lang.Object which was then cast
to the required class. This has a few disadvantages:

- Unnecessary casts in many places
- Hard to change the internal workings of InternalValue, especially to replace BLOBFileValue
- A few times, 'instanceof' was used, making it hard to change BLOBFileValue
- For developers new to the Jackrabbit code (like me), it's not always easy to understand
what is going on in the code
- NodeIndexer used the java.lang.Object directly, assuming the implementation will always
use objects such as Boolean, Long, Double, and BLOBFileValue.

In this patch, I added specific getter methods to InternalValue, as is done in the Value interface.
Additionally, there are getPath (for PropertyType.PATH), getQName (for PropertyType.NAME),
and getUUID (for PropertyType.REFERENCE).
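To illustrate the difference between the cast-based access and the typed getters, here is a
minimal sketch. It is not the code from internalValue.patch: SketchValue, its create() methods,
the local type constants, and the decision to throw IllegalStateException on a type mismatch
are all assumptions made for this example only.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for InternalValue (illustration only).
final class SketchValue {
    // Local constants for this sketch; they mirror the idea of PropertyType.
    static final int STRING = 1;
    static final int LONG = 3;
    static final int BOOLEAN = 6;

    private final Object value; // assumed to be never null
    private final int type;

    private SketchValue(Object value, int type) {
        this.value = Objects.requireNonNull(value, "value");
        this.type = type;
    }

    static SketchValue create(String s) { return new SketchValue(s, STRING); }
    static SketchValue create(long l) { return new SketchValue(Long.valueOf(l), LONG); }
    static SketchValue create(boolean b) { return new SketchValue(Boolean.valueOf(b), BOOLEAN); }

    int getType() { return type; }

    // Old style: the caller has to know the concrete class and cast.
    @Deprecated
    Object internalValue() { return value; }

    // New style: one getter per type, the cast lives in a single place.
    String getString() { check(STRING); return (String) value; }
    long getLong() { check(LONG); return (Long) value; }
    boolean getBoolean() { check(BOOLEAN); return (Boolean) value; }

    // Assumption of this sketch: a mismatched getter fails fast.
    private void check(int expected) {
        if (type != expected) {
            throw new IllegalStateException("type is " + type + ", expected " + expected);
        }
    }
}
```

With something like this, a caller writes v.getLong() instead of
((Long) v.internalValue()).longValue(), so the internal representation can change without
touching every call site.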

I had to make a few assertions. Some of them were not 100% clear from the code, so could you
please review them:

- The 'value' of InternalValue is never 'null'.
  ValueConstraint was checking for 'null', but as far as I can see
  it is never really possible to have a 'null' value.
- The type of QName.JCR_FROZENUUID is STRING (Object.toString() was used before).
- The type of QName.JCR_MIMETYPE is STRING
- The type of QName.JCR_ENCODING is STRING
- Currently, for types PropertyType.BINARY, the object is always 
  a BLOBFileValue (there was no other constructor for PropertyType.BINARY)

NodeIndexer still has a few unnecessary type casts (addBinaryValue, addBinaryValue,...) but
the methods are protected, and I was afraid to change them right now. I hope those can be
changed soon to avoid a few unnecessary casts and conversions (Long, Double).

There are no functional changes yet in this patch (as far as I can see). But I think this patch
is required, otherwise subsequent changes will be much harder.

I didn't remove InternalValue.internalValue so far, but marked it as deprecated. I hope it
can be removed in the near future.
 
Thomas


> Global data store for binaries
> ------------------------------
>
>                 Key: JCR-926
>                 URL: https://issues.apache.org/jira/browse/JCR-926
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Jukka Zitting
>         Attachments: DataStore.patch, DataStore2.patch, internalValue.patch, ReadWhileSaveTest.patch
>
>
> There are three main problems with the way Jackrabbit currently handles large binary
> values:
> 1) Persisting a large binary value blocks access to the persistence layer for extended
> amounts of time (see JCR-314)
> 2) At least two copies of binary streams are made when saving them through the JCR API:
> one in the transient space, and one when persisting the value
> 3) Versioning and copy operations on nodes or subtrees that contain large binary values
> can quickly end up consuming excessive amounts of storage space.
> To solve these issues (and to get other nice benefits), I propose that we implement a
> global "data store" concept in the repository. A data store is an append-only set of binary
> values that uses short identifiers to identify and access the stored binary values. The data
> store would trivially fit the requirements of transient space and transaction handling due
> to the append-only nature. An explicit mark-and-sweep garbage collection process could be
> added to avoid concerns about storing garbage values.
> See the recent NGP value record discussion, especially [1], for more background on this
> idea.
> [1] http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200705.mbox/%3c510143ac0705120919k37d48dc1jc7474b23c9f02cbd@mail.gmail.com%3e
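
The data store idea quoted above (append-only records, short identifiers, explicit
mark-and-sweep collection) could be sketched roughly as follows. This is an in-memory
illustration only, not the DataStore.patch code: the class SketchDataStore, its method
names, and the use of a SHA-1 content digest as the identifier are assumptions made for
this example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory sketch of an append-only data store (illustration only).
final class SketchDataStore {
    private final Map<String, byte[]> records = new HashMap<>();
    private final Set<String> marked = new HashSet<>();

    // Append a value and return its identifier. Records are never modified,
    // so adding the same content twice yields the same id and one stored copy.
    String add(byte[] data) {
        String id = digestHex(data);
        records.putIfAbsent(id, data.clone());
        return id;
    }

    // Look up a value by identifier; returns null if it is unknown.
    byte[] get(String id) {
        byte[] data = records.get(id);
        return data == null ? null : data.clone();
    }

    // Mark phase: the caller reports every identifier that is still referenced.
    void mark(String id) { marked.add(id); }

    // Sweep phase: drop all records that were not marked, then reset the marks.
    // Returns the number of records removed.
    int sweep() {
        int before = records.size();
        records.keySet().retainAll(marked);
        marked.clear();
        return before - records.size();
    }

    // Assumption of this sketch: the identifier is the hex SHA-1 of the content.
    private static String digestHex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, versioning or copying a node only copies the identifier, so the binary
itself is stored once; a collection run calls mark() for every identifier still in use
and then sweep().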

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

