jackrabbit-dev mailing list archives

From "Miro Walker" <miro.wal...@cognifide.com>
Subject RE: DP Persistence manager implementation
Date Fri, 03 Feb 2006 10:42:54 GMT
I think the hope is that a connection pool would increase concurrency
in a write-heavy system with tight performance targets, as Serge
suggested. That is the case in our current application, and would be
the case in other systems that I have been considering for use with
Jackrabbit. Examples are a heavily multi-user authoring tool or a
document management system, where write performance and concurrency
are critical.

I really like the idea of Jackrabbit / JCR becoming an RDBMS-grade
repository, but it's a long way from there now. In fact, a key
motivation of companies I've spoken to about using JCR has been their
need to get away from using filesystems to store data. Many
organisations see RDBMS support as a key feature of Jackrabbit et al.,
and JCR as a mechanism to allow them to make best use of their existing
investment in Oracle, MS SQL Server, etc. while still providing a
content-oriented interface. So Jackrabbit starts to find a place as a
WebDAV-to-RDBMS interface layer, for example.

Miro

-----Original Message-----
From: Serge Huber [mailto:shuber2@jahia.com] 
Sent: 03 February 2006 10:35
To: jackrabbit-dev@incubator.apache.org
Subject: Re: DP Persistence manager implementation


Just a quick note: from what I understand of Jackrabbit's main
design, it is optimized for usage scenarios where you:

read-a-lot, write-a-little

Or rather, where your read-to-write ratio is large. In your scenario it
seems that you are writing quite often, or is this a misconception?

Apart from that, I do agree that adding too much logic to a "Simple" DB 
PM would not be a good idea. We could indeed add more DB PMs, possibly 
without copy-pasting DB code. A modification to allow for JNDI lookup 
though would not be significant.

As for writing from multiple threads and such, you have to be careful
with transaction semantics, which *must* be preserved. I think one
interesting thing to look into might be optimistic locking, where all
the locking and object-change checks are made at the end of a
transaction, with the possibility that the transaction is rejected. Of
course this depends on the usage scenario and whether a rejected
transaction is acceptable or not.
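
To make the optimistic locking idea concrete, here is a minimal sketch in Java. It is not Jackrabbit code: the class OptimisticStoreSketch, its Tx and StaleStateException types, and the per-item version counters are all hypothetical names invented for illustration. Each transaction remembers the version of every item it touched, and the check happens only at commit time, so a transaction can be rejected if someone else committed first.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store, not part of Jackrabbit.
final class OptimisticStoreSketch {

    /** Thrown at commit time when another transaction changed an item first. */
    static final class StaleStateException extends Exception {
        StaleStateException(String id) {
            super("item changed concurrently: " + id);
        }
    }

    private final Map<String, Long> versions = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();

    /** A transaction that works on private copies until commit. */
    final class Tx {
        private final Map<String, Long> seenVersions = new HashMap<>();
        private final Map<String, String> writes = new HashMap<>();

        String read(String id) {
            if (writes.containsKey(id)) {
                return writes.get(id);
            }
            synchronized (OptimisticStoreSketch.this) {
                seenVersions.putIfAbsent(id, versions.getOrDefault(id, 0L));
                return values.get(id);
            }
        }

        void write(String id, String value) {
            synchronized (OptimisticStoreSketch.this) {
                // Remember which version this change is based on.
                seenVersions.putIfAbsent(id, versions.getOrDefault(id, 0L));
            }
            writes.put(id, value); // stays private until commit
        }

        void commit() throws StaleStateException {
            synchronized (OptimisticStoreSketch.this) {
                // All checks happen here, at the end of the transaction.
                for (Map.Entry<String, Long> e : seenVersions.entrySet()) {
                    long current = versions.getOrDefault(e.getKey(), 0L);
                    if (current != e.getValue()) {
                        throw new StaleStateException(e.getKey());
                    }
                }
                // No conflict: publish the writes and bump their versions.
                for (Map.Entry<String, String> e : writes.entrySet()) {
                    values.put(e.getKey(), e.getValue());
                    versions.merge(e.getKey(), 1L, Long::sum);
                }
            }
        }
    }

    Tx begin() {
        return new Tx();
    }
}
```

In this sketch the caller has to be prepared to catch the rejection and retry or report it, which is exactly the trade-off mentioned above.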

Regards,
  Serge Huber.

Przemyslaw Pakulski wrote:
> Marcel Reutegger <[EMAIL PROTECTED]> wrote:
>
>> this is not quite true. the actual store operation on the persistence
>> manager is synchronized. however most of the write calls from 
>> different threads to the JCR api in jackrabbit will not block each 
>> other because those changes are made in a private transient scope. 
>> only the final save or commit of the transaction is serialized. 
>> that's only one part of the whole write process
>
> But in fact, in a real application most business methods end with 
> either save, checkin or commit, and in consequence concurrent calls of
> these methods will block each other and wait for storing modified data.
> We are using the versioning feature intensively, and we have performance 
> problems mainly with write operations. Additionally, we notice a big 
> performance degradation when we switch to SimpleDBPersistenceManager 
> with MySQL or another db accessed over the network. So it looks like 
> overall performance depends heavily on the PM implementation, because all 
> save/checkin operations wait for the PM until it finishes all its work.
> One solution to avoid blocking write operations could be one or more 
> special threads responsible for flushing data to the PM, but I don't 
> think Jackrabbit uses asynchronous processing.
>
>> even if such a persistence manager allows concurrent writes, it is 
>> still the responsibility of the caller to ensure consistency. in our 
>> case that's the SharedItemStateManager. And that's the place where 
>> transactions are currently serialized, but only on commit.
>> If concurrent write performance should become a real issue that's 
>> where we first have to deal with it.
>
> If there exists any singleton component on top of the PM which is 
> responsible for serializing all saves, checkins or transactions, then 
> naturally using connection pools doesn't help, but maybe it means that
> Jackrabbit is not designed to work effectively in a multithreaded 
> environment.
>
> Even if using a connection pool is not reasonable in the current design, 
> I think it is worth considering JDBC batch updates instead of single 
> updates to gain better DBPM performance.
>
> Regards
> Przemo Pakulski
> www.cognifide.com
>
>
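
As an illustration of the JDBC batch update suggestion above, here is a minimal sketch using the standard java.sql API (PreparedStatement.addBatch and executeBatch). It is not taken from SimpleDBPersistenceManager: the class name BatchStoreSketch, the table NODE_STATE and its columns NODE_ID and STATE_DATA are assumptions made up for the example. The idea is to send all modified item states of one save in a single batch inside one database transaction, rather than one statement per item.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

// Hypothetical helper, not part of Jackrabbit.
final class BatchStoreSketch {

    // Assumed table and column names, for illustration only.
    static final String INSERT_SQL =
            "INSERT INTO NODE_STATE (NODE_ID, STATE_DATA) VALUES (?, ?)";

    /** Stores all states in one batch and returns the number of batched statements. */
    static int storeAll(Connection con, Map<String, byte[]> states) throws SQLException {
        boolean oldAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false); // one database transaction for the whole save
        try (PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            for (Map.Entry<String, byte[]> e : states.entrySet()) {
                ps.setString(1, e.getKey());
                ps.setBytes(2, e.getValue());
                ps.addBatch(); // queue the row instead of executing it now
            }
            int[] counts = ps.executeBatch(); // hand the queued rows to the driver together
            con.commit();
            return counts.length;
        } catch (SQLException ex) {
            con.rollback(); // keep the all-or-nothing semantics of the save
            throw ex;
        } finally {
            con.setAutoCommit(oldAutoCommit);
        }
    }
}
```

How much this helps depends on the driver and the database. It reduces per-statement overhead, but it does not by itself remove the serialization of commits discussed earlier in the thread.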

