jackrabbit-dev mailing list archives

From Ian Boston <...@tfd.co.uk>
Subject Re: Jackrabbit 1.2.3, RecordInput/DatabaseJournal
Date Fri, 27 Apr 2007 19:42:22 GMT


Over the past 3-4 years we have moved away from database persistence for 
the body of files, since a number of universities have 1TB of data or more.

I have no problem putting the metadata in the DB, but if we put the 
bodies in as well the DBAs throw a fit. It is just about bearable for 
Oracle, although shifting backups starts to become a problem, but we have 
seen some interesting results when a few hundred GB go into a MySQL db 
under InnoDB, not least in query times.

So the question becomes: how bad, transactionally, is having a DB-based 
PersistenceManager with the content (the BLOBs) on the filesystem?

I might be getting confused at this point, and confusing you with my 
lack of knowledge and terminology... so in my Workspace definition I am 
using:

             <param name="schema" value="${db.dialect}"/>
             <param name="schemaObjectPrefix" value="jcr_${wsp.name}_"/>
             <param name="externalBLOBs" value="${content.filesystem}"/>

Where SakaiPersistanceManager simply overrides the getConnection() 
method of the standard DB persistence manager.
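
To make the override concrete, here is a minimal sketch of the pattern, not 
the actual Sakai or Jackrabbit code. BaseDbPersistenceManager is a 
hypothetical stand-in for the stock database persistence manager (the real 
class has a different name and many more members), and 
DataSourcePersistenceManager is an illustrative name. The only assumption 
carried over is that the base class exposes an overridable getConnection().

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical stand-in for the stock DB persistence manager. It only
// models the one hook discussed here: how a JDBC connection is obtained.
abstract class BaseDbPersistenceManager {
    protected abstract Connection getConnection() throws SQLException;
}

// Sketch of the override: take the connection from a container-managed
// DataSource instead of going through DriverManager.
class DataSourcePersistenceManager extends BaseDbPersistenceManager {
    private final DataSource dataSource;

    DataSourcePersistenceManager(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    protected Connection getConnection() throws SQLException {
        // Delegates to the DataSource; pooling and credentials are left
        // to whatever the embedding application has configured.
        return dataSource.getConnection();
    }
}
```

The rest of the persistence manager stays untouched; only the way the 
connection is acquired changes.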

The DB is a standalone MySQL or Oracle instance.

Any pointers would be extremely helpful.


Dominique Pfister wrote:
> Hi Ian,
> On 4/27/07, Ian Boston <ieb@tfd.co.uk> wrote:
>> One quick question: which parts of the repository filesystem {rep.home}
>> should be in shared space and which in local space on the cluster node?
>> I'm using content on filesystem.
> In a clustered environment, using content on filesystem is not
> recommended: since the journal contains only the modified item's
> id, but not the content itself, all nodes have to save the content in
> the same location. Changes made by one node in the cluster should be
> isolated from other nodes until the change is actually committed, a
> condition the filesystem based persistence managers do not fulfill.
> I'd rather take a database based persistence manager, where the
> database is running standalone and not embedded. If you already use
> the DatabaseJournal with a JDBC datasource, it would probably make
> sense to use the same database to save your repository data.
> Kind regards
> Dominique
>> Ian
>> Dominique Pfister wrote:
>> > Hi Ian,
>> >
>> > On 4/27/07, Ian Boston <ieb@tfd.co.uk> wrote:
>> >> I want to extend or reimplement DatabaseJournal (core.cluster), but there
>> >> is a dependency on RecordInput which is a protected class (or at least
>> >> default scope).
>> >>
>> >> So you can't extend AbstractDatabaseJournal except in the same package
>> >> (perhaps that's the answer).
>> >>
>> >> Is there a reason for this, or was it an oversight?
>> >
>> > This is definitely an oversight. Ideally, DatabaseJournal should have
>> > a protected method named "getConnection", that may be overridden to
>> > change the way a connection is acquired. I will file a bug for this.
>> >
>> >> The reason I want to extend it is that I am embedding Jackrabbit into Sakai
>> >> (www.sakaiproject.org) and I would prefer to use a DataSource rather
>> >> than a DriverManager-delivered connection ... even if I get the
>> >> connection and keep it.
>> >
>> > For the time being, if using a DataSource is an absolute must, there
>> > is nothing else I can suggest than checking out the source code from
>> > svn, applying the required changes directly to your local copy and
>> > building a new, customized version.
>> >
>> > Kind regards
>> > Dominique
