jackrabbit-dev mailing list archives

From Ian Boston <...@tfd.co.uk>
Subject Re: Jackrabbit 1.2.3, RecordInput/DatabaseJournal
Date Fri, 04 May 2007 22:11:49 GMT
Dominique Pfister wrote:
> Hi Ian,
> 
> On 4/27/07, Ian Boston <ieb@tfd.co.uk> wrote:
>>
>> Dominique,
>>
>> Over the past 3-4 years we have moved away from database persistence for
>> the body of files, since a number of universities have 1 TB of data or
>> more.
>>
>> I have no problem putting the metadata in the DB, but if we put the
>> bodies in as well the DBAs throw a fit. It is just about bearable for
>> Oracle, although shifting backups starts to become a problem, but we have
>> seen some interesting results when a few hundred GB go into a MySQL DB
>> under InnoDB, not least in query times.
>>
>> So the question becomes: how bad, transactionally, is having a DB-based
>> PersistenceManager and the content (the BLOBs) on the filesystem?
> 
> Quite bad. The DbBLOBStore, that will store blobs in the DB, uses the
> same underlying JDBC connection and will therefore atomically save all
> other changes along with the blobs. The FileSystemBLOBStore, on the
> other hand, fulfills only the D(urability) of the ACID transaction
> properties: it could save a blob even if the database operation
> fails, and would therefore break consistency. IMHO, the time
> needed to make it transaction-safe is considerable.


Would Commons Transaction work for this?

http://jakarta.apache.org/commons/transaction/file/index.html

I'm happy to look into the FileSystemBLOBStore.
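
As a rough illustration of the ordering problem, here is a sketch only. It
uses plain java.nio rather than Commons Transaction, and the names
StagedBlobWriter and DbWork are hypothetical, not Jackrabbit classes. The
idea: write the blob to a temporary file first, run the database commit,
and rename the file into place only once the commit has succeeded.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of Jackrabbit or Commons Transaction.
final class StagedBlobWriter {

    /** Stand-in for the JDBC work that saves the metadata and commits. */
    interface DbWork {
        void commit() throws Exception;
    }

    /**
     * Stages the blob in a temporary file, runs the database commit, and
     * moves the file to its final name only after the commit succeeded.
     * If writing or committing fails, the staged file is deleted and the
     * exception is rethrown, so no blob is left behind for a failed commit.
     */
    static Path store(Path dir, String blobId, byte[] data, DbWork db) throws Exception {
        Files.createDirectories(dir);
        // Stage in the target directory so the later rename stays on one filesystem.
        Path staged = Files.createTempFile(dir, blobId + "-", ".tmp");
        try {
            Files.write(staged, data);
            db.commit();
        } catch (Exception e) {
            Files.deleteIfExists(staged);
            throw e;
        }
        return Files.move(staged, dir.resolve(blobId), StandardCopyOption.ATOMIC_MOVE);
    }
}
```

This still leaves a window: a crash after the database commit but before
the rename leaves committed metadata without its blob, so a real
implementation would need recovery on startup or a proper two-phase
approach.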

> 
>> Any pointers would be extremely helpful.
> 
> There might be DB-specific extensions that tell the DB to store large
> files externally in the file system (e.g. Oracle's BFILE) but that
> would imply coding some custom database persistence manager that knows
> how to deal with this situation. Not sure whether those extensions
> still work in a transactional manner, though.

We moved away from putting raw content into the DB (non JSR-170 store)
a few years ago when DBAs reported lots of problems in production. If
it can be avoided I'd prefer not to go back there.

Ian

> 
> Kind regards
> Dominique
> 
>>
>>
>> Dominique Pfister wrote:
>> > Hi Ian,
>> >
>> > On 4/27/07, Ian Boston <ieb@tfd.co.uk> wrote:
>> >> One quick question: which parts of the repository filesystem {rep.home}
>> >> should be in shared space and which in local space on the cluster node?
>> >> I'm using content on the filesystem.
>> >
>> > In a clustered environment, using content on filesystem is not
>> > recommended: since the journal contains only the modified item's
>> > id, but not the content itself, all nodes have to save the content in
>> > the same location. Changes made by one node in the cluster should be
>> > isolated from other nodes until the change is actually committed, a
>> > condition the filesystem based persistence managers do not fulfill.
>> >
>> > I'd rather take a database based persistence manager, where the
>> > database is running standalone and not embedded. If you already use
>> > the DatabaseJournal with a JDBC datasource, it would probably make
>> > sense to use the same database to save your repository data.
>> >
>> > Kind regards
>> > Dominique
>> >
>> >>
>> >>
>> >> Ian
>> >>
>> >> Dominique Pfister wrote:
>> >> > Hi Ian,
>> >> >
>> >> > On 4/27/07, Ian Boston <ieb@tfd.co.uk> wrote:
>> >> >> I want to extend or reimplement DatabaseJournal (core.cluster), but
>> >> >> there is a dependency on RecordInput, which is a protected class (or
>> >> >> at least default scope).
>> >> >>
>> >> >> So you can't extend AbstractDatabaseJournal except in the same
>> >> >> package (perhaps that's the answer).
>> >> >>
>> >> >> Is there a reason for this, or was it an oversight?
>> >> >
>> >> > This is definitely an oversight. Ideally, DatabaseJournal should have
>> >> > a protected method named "getConnection" that may be overridden to
>> >> > change the way a connection is acquired. I will file a bug for this.
>> >> >
>> >> >> The reason I want to extend it is that I am embedding Jackrabbit into
>> >> >> Sakai (www.sakaiproject.org) and I would prefer to use a DataSource
>> >> >> rather than a DriverManager-delivered connection, even if I get the
>> >> >> connection and keep it.
>> >> >
>> >> > For the time being, if using a DataSource is an absolute must, there
>> >> > is nothing else I can suggest than checking out the source code from
>> >> > svn, applying the required changes directly to your local copy and
>> >> > building a new, customized version.
>> >> >
>> >> > Kind regards
>> >> > Dominique
>> >>
>> >>
>>
>>
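
To illustrate the getConnection suggestion quoted above, here is a minimal
sketch. SketchJournal and DataSourceJournal are hypothetical stand-ins, not
the real DatabaseJournal API. The only idea taken from the thread is a
protected getConnection() hook that a subclass overrides to obtain its
connection from a DataSource instead of DriverManager.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical stand-in for a journal base class; NOT the real Jackrabbit class.
class SketchJournal {
    private final String url;

    SketchJournal(String url) {
        this.url = url;
    }

    /** Default behaviour: obtain the connection from DriverManager. */
    protected Connection getConnection() throws SQLException {
        return DriverManager.getConnection(url);
    }

    /** Example of journal code that acquires its connection through the hook. */
    boolean canConnect() throws SQLException {
        try (Connection c = getConnection()) {
            return !c.isClosed();
        }
    }
}

// Hypothetical subclass that swaps in a container-managed DataSource.
class DataSourceJournal extends SketchJournal {
    private final DataSource dataSource;

    DataSourceJournal(DataSource dataSource) {
        super(null); // the JDBC URL is unused once getConnection() is overridden
        this.dataSource = dataSource;
    }

    @Override
    protected Connection getConnection() throws SQLException {
        return dataSource.getConnection();
    }
}
```

With a hook like this, an embedding application such as Sakai could pass
in its own DataSource without patching the base class, as long as every
place in the journal that needs a connection goes through getConnection().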

