jackrabbit-dev mailing list archives

From Vadim Gritsenko <va...@reverycodes.com>
Subject Re: Is JDBC persistence manager supported by jackrabbit?
Date Thu, 01 Sep 2005 16:27:42 GMT
Edgar Poce wrote:
> Hi vadim
> Vadim Gritsenko wrote:
>> Was trying to find more information following your references, but...
>>> [1] http://thread.gmane.org/gmane.comp.apache.jackrabbit.devel/1435
>> Points to JIRA which states [1]:
>>    Comment by Edgar Poce [12/Jul/05 06:00 AM]
>>    This kind of approach is discouraged by design
>> Can you please clarify your point? 
> There are a couple of conversations in the archive about this. My point 
> is that the PM contract is not suitable for mapping the itemstates into 
> a relational database with a table design that breaks the ItemState into 
> its constituent parts.

Ok. Breaking ItemState into parts would tie the storage layer to the ItemState 
structure - which would make it impossible to change the structure or 
hierarchy of ItemStates...

(Can one (potentially) have custom ItemState(s)?)

But - this makes me wonder why the OJB PM and Hibernate PM are considered to be 
an acceptable design even though they are implemented in exactly the same way - 
they break up ItemState into parts!?

> The PM is intended to keep it simple, which means 
> to store the itemstate as a whole without interpreting the data. See the 
> jdbc pm under contrib.

Yep, saw that. And was puzzled by OJB/Hibernate.
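
To make the "store the itemstate as a whole" idea concrete, here is a minimal 
sketch of an opaque-blob store. It is an illustration only, not the jdbc pm 
from contrib: ToyNodeState and OpaqueBlobStore are hypothetical names, the 
in-memory map stands in for a two-column (id, data) table, and plain Java 
serialization is an assumed encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical store: one opaque byte[] per state id, never interpreted.
class OpaqueBlobStore {
    // Stands in for a table with columns (id, data); an assumption for this sketch.
    private final Map<String, byte[]> rows = new HashMap<>();

    // Serialize the whole state and keep the bytes under its id.
    void store(ToyNodeState state) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        rows.put(state.id, buffer.toByteArray());
    }

    // Read the bytes back and rebuild the state; returns null for an unknown id.
    ToyNodeState load(String id) {
        byte[] data = rows.get(id);
        if (data == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ToyNodeState) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        OpaqueBlobStore store = new OpaqueBlobStore();
        ToyNodeState state = new ToyNodeState("node-1");
        state.childNames.add("a");
        state.childNames.add("b");
        store.store(state);
        System.out.println("children after reload: " + store.load("node-1").childNames);
    }
}

// Hypothetical stand-in for a node state; NOT Jackrabbit's ItemState.
class ToyNodeState implements Serializable {
    private static final long serialVersionUID = 1L;
    final String id;
    final List<String> childNames = new ArrayList<>();

    ToyNodeState(String id) {
        this.id = id;
    }
}
```

The store only ever sees an id and a byte array, so the shape of the state can 
change without touching the schema. The price is that nothing in the storage 
layer can query or constrain what is inside the blob.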

> The main problem to store the itemstates in a complex schema is the 
> Collection handling. Since Collection fields changes are not logged into 
> add/update/remove aware objects, all the elements in the Collection must 
> be stored on each write call. It causes a hit on performance when 
> handling collections with lots of elements, even with the simple PMs 
> included in the core.

Saw it in storeChildNodeEntries - yep, it sure should be slow.
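
A rough way to see why this hurts, as a sketch rather than Jackrabbit code: 
assume one row per child entry and a save after every added child. 
ChildEntryWriteCost and both method names are made up for illustration; the 
"delta" variant assumes a change-aware collection, which is exactly what the 
current objects do not provide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cost model: counts rows written while a node grows child by child.
class ChildEntryWriteCost {

    // No change tracking: each save rewrites every entry currently in the list.
    static long rowsWrittenFullRewrite(int children) {
        List<String> entries = new ArrayList<>();
        long rows = 0;
        for (int i = 0; i < children; i++) {
            entries.add("child" + i);
            rows += entries.size(); // whole collection stored again on this write
        }
        return rows;
    }

    // Assumed change-aware collection: each save writes only the added entry.
    static long rowsWrittenDelta(int children) {
        long rows = 0;
        for (int i = 0; i < children; i++) {
            rows += 1; // only the new entry is stored
        }
        return rows;
    }

    public static void main(String[] args) {
        int n = 3000; // the child count mentioned in the quoted test
        System.out.println("full rewrite: " + rowsWrittenFullRewrite(n)); // 4501500
        System.out.println("delta:        " + rowsWrittenDelta(n));       // 3000
    }
}
```

Under these assumptions the total work grows quadratically with the number of 
children when the whole collection is rewritten, and linearly when only the 
changes are written.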

> see the second chart in http://issues.apache.org/jira/browse/JCR-188. In 
> my PIV box with Object PM + cqfs, any write operation (e.g. set a 
> property) takes up to half a sec when the given node reaches 3k children.
> If I tried to run the same test with the impl proposed in jcr-91, the 
> half sec mark would be reached much sooner than with 3k children, just a 
> hundred children would make the repo unbearably slow.
> when I decided to write the jdbc pm proposed in jcr-91 I wanted:
> 1 - a mature, transactional and scalable persistence storage
> 2 - use rdbms administrative tools, like scheduled backups, etc.
> 3 - rdbms referential integrity
> 4 - avoid redundancy. PMs store the NodeReferences twice.
> 5 - a storage that allows to modify the data easily, just in case.

I need at least 1, 2, and clustering on top of that... None of the existing PMs 
will work in a clustered environment (OJB and Hibernate do not count).

> But in order to achieve the above goals the PM should interpret the data 
> :(. Maybe we can bring this up again after the first release ...

Why wait for the release? :-) Isn't contrib meant to be the place for 
experimental code? :-) Let's bring it up before that - SimpleDB isn't usable 
either:

   * Synchronized to death
   * Stores BLOBs locally

>> Or maybe point to the document /
>> discussion regarding the design?
> Even though it's not directly related, you might want to take a look at 
> Dominique's post about jackrabbit internals. See 
> http://article.gmane.org/gmane.comp.apache.jackrabbit.devel/1223

I remember seeing this post some time ago :-)

>>> [2] http://wiki.apache.org/jackrabbit/PersistenceManagerFAQ
>> Points to Wiki page which does not clarify your POV either. 
> It's not my point of view. I just collected the devs opinions on this 
> issue from the mailing list. If it's not clear please trace the 
> conversations in the archive and clarify it.

Tried to do that, to no avail. Searches for 'JDBC' and 'DB' do not turn up much.

>> It states though:
>>    The PM interface was never intended as being a general SPI that
>>    you could implement in order to integrate external datasources
>>    with proprietary formats (e.g. a customers database).
>> This raises the question, what is the recommended SPI to code against?
> I think that the jcr-ext project under contrib might be a good starting 
> point. Or, although the PM is not intended to be an SPI, you can manage to 
> plug in your legacy data if you do it carefully.

Thanks for the pointers. Do you suggest using decorators? I don't see, though, 
how they could be plugged into Jackrabbit...

