jackrabbit-dev mailing list archives

From "Miro Walker" <miro.wal...@cognifide.com>
Subject DB Persistence manager implementation
Date Thu, 02 Feb 2006 10:38:31 GMT

We've been discussing the DB PM implementation, and have a couple of
questions about it. At the moment, the Simple DB PM appears to be
implemented using a single JDBC connection, with all write operations
synchronised on a single object. This implies that all writes to the
database are single threaded, effectively making any application using
it run single threaded for write operations as well. This appears to
have two implications:

1. Performance - in a multi-user system, having single-threaded writes
to the database will make the JDBC connection a serious bottleneck as
soon as the application comes under load. It also means that any
background processing that needs to iterate over the repository making
changes (and we have a few of those) will effectively bring all other
users to a grinding halt. 

2. Transactions - we haven't tested this (as the recent support for
transactions in versioning operations has not been integrated into our
system), but it appears that if a single connection is being used, then
we can only have a single database transaction active at any one time.
So, if each user tries to execute a transaction containing multiple
write operations, and these transactions are to be propagated through to
the database, then either we get exceptions when the system attempts to
interleave operations from different transactions, or each transaction
must complete in full before the next can begin, further compounding the
performance issue.
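To illustrate the pattern described above, here is a minimal sketch (not
Jackrabbit's actual code; the class and method names are made up): one
shared resource standing in for the single JDBC connection, with every
write serialised on one monitor. However many threads write, observed
concurrency never exceeds one.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the single-connection pattern: all writers queue on one monitor.
class SingleConnectionStore {
    private final Object writeLock = new Object();            // the single monitor
    private final List<String> writes = new ArrayList<>();    // stands in for the one connection
    private int active = 0;
    private int maxObservedConcurrency = 0;

    void write(String item) throws InterruptedException {
        synchronized (writeLock) {                            // serialisation point
            active++;
            maxObservedConcurrency = Math.max(maxObservedConcurrency, active);
            Thread.sleep(5);                                  // simulate a JDBC round-trip
            writes.add(item);
            active--;
        }
    }

    int maxObservedConcurrency() { return maxObservedConcurrency; }
    int size() { return writes.size(); }
}

public class SingleWriterDemo {
    public static void main(String[] args) throws Exception {
        SingleConnectionStore store = new SingleConnectionStore();
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            final int id = t;
            Thread th = new Thread(() -> {
                try {
                    for (int i = 0; i < 5; i++) store.write("t" + id + "-" + i);
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
            threads.add(th);
            th.start();
        }
        for (Thread th : threads) th.join();
        // All 20 writes complete, but never more than one at a time.
        System.out.println(store.size() + " writes, max concurrency "
                + store.maxObservedConcurrency());
    }
}
```

Four threads together take at least 20 x 5ms of wall-clock time here,
which is the bottleneck point 1 describes.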

In addition to the implications of using a single synchronised
connection, another issue appears to be that the system will be unable
to recover from a connection failure. For example, if the system were
deployed onto a highly available database cluster, then in the event of
DB instance failure, any open connections will be killed, but can quite
happily be reopened later. Jackrabbit appears to create a connection on
initialisation, and has no way to recover if that connection is killed.
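The kind of recovery being asked for could look something like the
following sketch - entirely hypothetical, not Jackrabbit code: rather
than opening one connection at initialisation and trusting it forever,
the open is wrapped in a retry loop so a connection killed by a cluster
failover can be re-established. The `ConnectionFactory` interface and
backoff numbers are illustrative only.

```java
// Hypothetical recovery sketch: retry opening a connection with backoff.
public class ReconnectDemo {
    interface ConnectionFactory<T> { T open() throws Exception; }

    static <T> T openWithRetry(ConnectionFactory<T> factory, int maxAttempts)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return factory.open();       // e.g. DriverManager.getConnection(url)
            } catch (Exception e) {
                last = e;                    // DB node down: wait, then try again
                Thread.sleep(10L * attempt); // simple linear backoff
            }
        }
        throw last;                          // give up after maxAttempts
    }

    public static void main(String[] args) throws Exception {
        // Simulate a clustered DB that refuses the first two attempts.
        final int[] calls = {0};
        String conn = openWithRetry(() -> {
            if (++calls[0] < 3) throw new Exception("node failing over");
            return "live-connection";
        }, 5);
        System.out.println(conn + " after " + calls[0] + " attempts");
    }
}
```

The same check-and-reopen step would have to run before each use of the
connection, not just at startup, for the HA scenario above to work.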

I know that questions around implementing support for connection pooling
on the DB have been raised before and then dismissed as unimportant, but
this appears to me to be pretty fundamental. By using a connection pool
implementation that can recreate dead connections and tie a connection
to a transaction context, multiple transactions could run in parallel,
improving throughput and making the system more reliable.

What do people think? Could we look to use Jakarta commons dbcp?
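Commons DBCP would give us this behaviour off the shelf (its
BasicDataSource can validate connections as they are borrowed and
discard dead ones). As a rough illustration of the mechanism - a
hand-rolled sketch, not DBCP's actual implementation, with purely
illustrative names and a String standing in for java.sql.Connection:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hand-rolled sketch of what a pool like commons-dbcp provides:
// borrow a connection, validate it, and transparently replace it if dead.
class TinyPool<T> {
    private final Supplier<T> factory;   // opens a new connection
    private final Predicate<T> isValid;  // e.g. runs a cheap validation query
    private final Deque<T> idle = new ArrayDeque<>();

    TinyPool(Supplier<T> factory, Predicate<T> isValid) {
        this.factory = factory;
        this.isValid = isValid;
    }

    synchronized T borrow() {
        while (!idle.isEmpty()) {
            T c = idle.poll();
            if (isValid.test(c)) return c;  // healthy: hand it out
            // dead connection is silently discarded; a fresh one is made below
        }
        return factory.get();
    }

    synchronized void release(T c) { idle.push(c); }
}

public class PoolDemo {
    public static void main(String[] args) {
        final int[] opened = {0};
        TinyPool<String> pool = new TinyPool<>(
            () -> "conn-" + (++opened[0]),
            c -> !c.equals("conn-1"));   // pretend the first connection has died
        String a = pool.borrow();        // opens conn-1
        pool.release(a);
        String b = pool.borrow();        // conn-1 fails validation, conn-2 opened
        System.out.println(b + ", opened " + opened[0]);
    }
}
```

A caller then borrows per operation (or per transaction) instead of
sharing one global connection, which is what would let transactions run
in parallel.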
