jackrabbit-dev mailing list archives

From "Miro Walker" <miro.wal...@cognifide.com>
Subject RE: DP Persistence manager implementation
Date Fri, 03 Feb 2006 10:42:54 GMT
I think the hope is that a connection pool would increase concurrency
in a write-heavy system with tight performance targets, as Serge
suggested. This is the case in our current application, and it would
also be the case in other systems I have been considering for use with
Jackrabbit. Examples are a heavily multi-user authoring tool or a
document management system, where write performance and concurrency are
critical.

I really like the idea of Jackrabbit / JCR becoming an RDBMS-grade
repository, but it is a long way from there now. In fact, a key
motivation for the companies I've spoken to about using JCR has been
their need to get away from storing data on the filesystem. Many
organisations see RDBMS support as a key feature of Jackrabbit et al.,
and see JCR as a mechanism that lets them make the best use of their
existing investment in Oracle, MS SQL Server, etc. while still
providing a content-oriented interface. So Jackrabbit starts to find a
place as, for example, a WebDAV-to-RDBMS interface layer.


-----Original Message-----
From: Serge Huber [mailto:shuber2@jahia.com] 
Sent: 03 February 2006 10:35
To: jackrabbit-dev@incubator.apache.org
Subject: Re: DP Persistence manager implementation

Just a quick note: from what I understand of Jackrabbit's main design,
it is optimized for usage scenarios where you:

read-a-lot, write-a-little

Or rather, where your read-to-write ratio is large. In your scenario it
seems that you are writing quite often, or is this a misconception?

Apart from that, I do agree that adding too much logic to a "Simple" DB
PM would not be a good idea. We could indeed add more DB PMs, possibly
without copy-pasting DB code. A modification to allow for a JNDI
lookup, though, would not be a significant change.
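
As a purely illustrative sketch (none of this is existing Jackrabbit
code, and the JNDI name is made up), a DB PM could look up a
container-managed DataSource, which is typically pooled, instead of
opening connections itself via DriverManager:

import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical helper: obtain connections from a DataSource bound in
// JNDI rather than creating them directly.
public class JndiConnectionFactory {

    private final DataSource dataSource;

    public JndiConnectionFactory(String jndiName) throws NamingException {
        // e.g. "java:comp/env/jdbc/JackrabbitDS" -- the name is an assumption
        InitialContext ctx = new InitialContext();
        this.dataSource = (DataSource) ctx.lookup(jndiName);
    }

    public Connection getConnection() throws SQLException {
        // The container's pool hands out (and later reclaims) connections.
        return dataSource.getConnection();
    }
}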

As for threaded writing and such, you have to be careful with
transaction semantics, which *must* be preserved. One interesting thing
to look into might be optimistic locking, where all the locking and
object-change checks are made at the end of a transaction, with the
possibility that the transaction is rejected. Of course this depends on
the usage scenario and whether such rejections are acceptable.
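
As a rough sketch of the optimistic-locking idea (the table and column
names are made up for illustration): each stored item carries a version
counter, and the commit-time UPDATE succeeds only if nobody has bumped
the counter since the item was read.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch only: reject the write at commit time if the row's version
// changed since we read it, instead of holding locks up front.
public class OptimisticItemStore {

    public void store(Connection con, String itemId, byte[] data,
                      long expectedVersion) throws SQLException {
        String sql = "UPDATE item_state SET data = ?, version = version + 1"
                   + " WHERE item_id = ? AND version = ?";
        PreparedStatement stmt = con.prepareStatement(sql);
        try {
            stmt.setBytes(1, data);
            stmt.setString(2, itemId);
            stmt.setLong(3, expectedVersion);
            if (stmt.executeUpdate() == 0) {
                // Someone else committed first: the caller must roll back
                // and retry, or report the conflict.
                throw new SQLException("Optimistic lock failed: " + itemId);
            }
        } finally {
            stmt.close();
        }
    }
}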

  Serge Huber.

Przemyslaw Pakulski wrote:
> Marcel Reutegger <[EMAIL PROTECTED]> wrote:
>> This is not quite true. The actual store operation on the persistence
>> manager is synchronized. However, most of the write calls from
>> different threads to the JCR API in Jackrabbit will not block each
>> other, because those changes are made in a private transient scope;
>> only the final save or commit of the transaction is serialized.
>> That's only one part of the whole write process.
> But in fact, in a real application most business methods end with
> either save, checkin or commit, and consequently concurrent calls to
> these methods will block each other while waiting for the modified
> data to be stored.
> We are using the versioning feature intensively, and we have
> performance problems mainly with write operations. Additionally, we
> noticed a big performance degradation when we switched to
> SimpleDBPersistenceManager with MySQL or another DB that communicates
> over the network. So it looks like overall performance depends heavily
> on the PM implementation, because all save/checkin operations wait for
> the PM until it finishes all its work.
> One solution to avoid blocking write operations could be a special
> thread (or threads) responsible for flushing data to the PM, but I
> don't think Jackrabbit uses asynchronous processing.
>> Even if such a persistence manager allows concurrent writes, it is
>> still the responsibility of the caller to ensure consistency. In our
>> case that's the SharedItemStateManager, and that's the place where
>> transactions are currently serialized, but only on commit.
>> If concurrent write performance should become a real issue, that's
>> where we first have to deal with it.
> If there is a singleton component on top of the PM that is
> responsible for serializing all saves, checkins or transactions, then
> naturally using connection pools doesn't help; but maybe it means that
> Jackrabbit is not designed to work effectively in a multithreaded
> environment.
> Even if using a connection pool is not reasonable in the current
> design, I think it is worth considering JDBC batch updates instead of
> single updates to gain better DBPM performance (see the sketch after
> this message).
> Regards
> Przemo Pakulski
> www.cognifide.com
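
To illustrate Przemo's batch-update suggestion, here is a minimal
sketch (the table and column names are assumptions, not Jackrabbit's
schema): the dirty item states are queued with addBatch() and flushed
in a single executeBatch() call, so the database sees one round trip
instead of one per modified item.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

// Sketch only: flush all dirty item states in one JDBC batch.
public class BatchingWriter {

    public void flush(Connection con, Map<String, byte[]> dirtyItems)
            throws SQLException {
        String sql = "UPDATE item_state SET data = ? WHERE item_id = ?";
        PreparedStatement stmt = con.prepareStatement(sql);
        try {
            for (Map.Entry<String, byte[]> entry : dirtyItems.entrySet()) {
                stmt.setBytes(1, entry.getValue());
                stmt.setString(2, entry.getKey());
                stmt.addBatch();       // accumulate, no round trip yet
            }
            stmt.executeBatch();       // one round trip for all updates
        } finally {
            stmt.close();
        }
    }
}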
