jackrabbit-dev mailing list archives

From Vadim Gritsenko <va...@reverycodes.com>
Subject Re: Is JDBC persistence manager supported by jackrabbit?
Date Fri, 02 Sep 2005 12:45:09 GMT
Marcel Reutegger wrote:
> Vadim Gritsenko wrote:
>> Edgar Poce wrote:
>>> when I decided to write the jdbc pm proposed in jcr-91 I wanted:
>>> 1 - a mature, transactional and scalable persistence storage
>>> 2 - use rdbms administrative tools, like scheduled backups, etc.
>>> 3 - rdbms referential integrity
>>> 4 - avoid redundancy. PMs store the NodeReferences twice.
>>> 5 - a storage that allows to modify the data easily, just in case.
>> I need at least 1, 2, and clustering on top of that... None of 
>> existing PMs will work in cluster environment (OJB and Hibernate do 
>> not count).
> Please note that clustering Jackrabbit is not just about the persistence 
> manager. It also involves many other areas that we need to take care of.

I know. But having a transactional, clustered PM will enable me to create a
cluster of Level 1 repository instances and run them on app servers. The next
step can be enabling flushing/synchronization of caches on those Level 1
instances. And after all that is done, full clustering (with distributed
locking, etc.) will be easier to tackle.

> See: http://issues.apache.org/jira/browse/JCR-169 for a starting point 
> on discussions about this topic.

Thanks for the pointer.

>> Why wait release? :-) Isn't code in contrib meant to be grounds for 
>> experimental code? :-) Let's bring it up before that - SimpleDB isn't 
>> usable as well:
>>   * Synchronized to death
>>   * Stored BLOBs locally
> Feel free to provide patches to enhance concurrency.

My first patch, then, will be a port of the connection pools from Edgar's JDBC
PM. Once the DB PM has access to a DB connection pool, there will be no need for
any synchronization. Would you accept it?
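
To illustrate why pooling removes the need for a global lock, here is a minimal sketch. It is not the code from Edgar's JDBC PM; the class name SimplePool and its methods are hypothetical. Each operation borrows its own resource, so callers block only when every pooled resource is in use. In a real PM the pooled type would be java.sql.Connection, typically obtained from a container-managed javax.sql.DataSource rather than from a hand-written pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical bounded pool; T would be java.sql.Connection in a real PM. */
final class SimplePool<T> {
    private final BlockingQueue<T> idle;

    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Borrows a resource, runs the operation, and always returns the resource. */
    <R> R withResource(Function<T, R> op) {
        T resource;
        try {
            resource = idle.take(); // blocks only when all resources are in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a resource", e);
        }
        try {
            return op.apply(resource);
        } finally {
            idle.add(resource); // always fits: capacity equals the number created
        }
    }

    /** Number of resources currently not borrowed. */
    int idleCount() {
        return idle.size();
    }
}
```

With this shape, load() and exists() would each run inside their own withResource call instead of inside a synchronized block around one shared connection.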

> Some enhancements that crossed my mind are:
> - use a separate read-only connection for load() and exists() operations
> - use a pool of prepared statements for load() and exists()

There are issues with the single/double-connection design, besides the fact that
(J2EE) applications are discouraged from managing system resources themselves:

   * No transaction isolation - which creates the need for synchronization
   * No keep-alive monitoring
   * No ability to reconnect a severed connection

As for statement caching, IIRC the driver does this.
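
As a rough sketch of what keep-alive monitoring and reconnecting involve (the work a pool or container normally does for you), the hypothetical holder below checks liveness before handing out a resource and replaces it when the check fails. The class ValidatingHolder and its methods are made up for illustration. For JDBC, the liveness check is assumed to be something like a cheap test query, or Connection.isValid(int) on drivers that support JDBC 4.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical holder that validates a resource before handing it out. */
final class ValidatingHolder<T> {
    private final Supplier<T> factory;   // opens a new resource, e.g. a connection
    private final Predicate<T> isAlive;  // liveness check, e.g. a test query
    private T current;
    private int reconnects;

    ValidatingHolder(Supplier<T> factory, Predicate<T> isAlive) {
        this.factory = factory;
        this.isAlive = isAlive;
    }

    /** Returns a live resource, replacing it if the liveness check fails. */
    synchronized T get() {
        if (current == null) {
            current = factory.get();          // first use: open lazily
        } else if (!isAlive.test(current)) {
            current = factory.get();          // severed: open a replacement
            reconnects++;
        }
        return current;
    }

    /** Number of times a dead resource had to be replaced. */
    synchronized int reconnectCount() {
        return reconnects;
    }
}
```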

> With those changes we can then loosen some of the synchronization.
> BLOBs are stored locally because many DBs are known for their bad 
> performance when it comes to handling streams. So, speaking of 
> enhancements, introducing a configuration choice for BLOB handling is 
> probably another one.

Locally stored BLOBs might be OK for a non-clustered environment. They might
even be OK in some cluster deployments, if there is a replication mechanism.

But I don't think it is a good idea to replicate the full set of BLOBs to every
server that happens to need access to the repository (and multiple times, if a
server runs more than one webapp). I prefer having all BLOBs in one place,
even if it is a bit slower...

