jackrabbit-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Jackrabbit Wiki] Update of "PersistenceManagerFAQ" by edgarpoce
Date Thu, 09 Jun 2005 03:34:14 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Jackrabbit Wiki" for change notification.

The following page has been changed by edgarpoce:
http://wiki.apache.org/jackrabbit/PersistenceManagerFAQ

New page:
= PersistenceManager (PM) FAQ =
The responses were mainly gathered from the Jackrabbit mailing list.

=== What's a PM? ===
The PM is an *internal* Jackrabbit component that handles the persistent storage of content
nodes and properties. Each workspace of a Jackrabbit content repository uses a separate persistence
manager to store the content in that workspace. The Jackrabbit version handler also uses its own
persistence manager. The PM sits at the very bottom layer of Jackrabbit's system architecture.

Reliability, integrity and performance of the PM are *crucial* to the overall stability and
performance of the repository. For example, if the data that a PM is based upon is allowed to
change through external means, the integrity of the repository is at risk (think of referential
integrity: node references could end up pointing at nodes that no longer exist).
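
The referential-integrity risk can be illustrated with a small sketch. It assumes a deliberately simplified model in which the store is just a map from node IDs to the IDs each node references; the class `ReferenceCheck` and its method are hypothetical and are not part of Jackrabbit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; NOT Jackrabbit's actual data structures.
final class ReferenceCheck {

    // Returns every ID that some node references but that is missing from the store.
    // 'references' maps each stored node ID to the set of node IDs it references.
    static Set<String> danglingTargets(Map<String, Set<String>> references) {
        Set<String> dangling = new HashSet<>();
        for (Set<String> targets : references.values()) {
            for (String target : targets) {
                if (!references.containsKey(target)) {
                    dangling.add(target);
                }
            }
        }
        return dangling;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> store = new HashMap<>();
        store.put("a", Collections.singleton("b")); // node "a" references node "b"
        store.put("b", Collections.emptySet());
        System.out.println(danglingTargets(store)); // no dangling references yet

        // Simulates an external tool deleting "b" behind the repository's back.
        store.remove("b");
        System.out.println(danglingTargets(store)); // "a" now points at a missing node
    }
}
```

The repository can only prevent this state if every change goes through it; a change made directly to the underlying data bypasses any such check.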

=== What's the PM's responsibility? ===
The PM interface was never intended to be a general SPI that you could implement in order
to integrate external data sources with proprietary formats (e.g. a customer's database). The
reason why we abstracted the PM interface was to leave room for future performance optimizations
that would not affect the rest of the implementation (e.g. by storing the raw data in a b-tree
based database instead of individual files).

=== How smart should a PM be? ===
A PM should not be 'intelligent'; it should not 'interpret' the data. The only thing it should
care about is to efficiently, consistently and reliably store and read the data encapsulated
in the passed nodeState and propertyState objects. Though it might be feasible to write
a custom persistence manager to represent existing legacy data in a level-1 (read-only) repository,
I don't think the same is possible for a level-2 repository, and I certainly would not recommend
it.
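
As a rough illustration of what 'not intelligent' means, here is a minimal sketch of a store that treats item states as opaque bytes keyed by an ID. The names `OpaqueStateStore` and `InMemoryStateStore` are hypothetical and chosen for this example; Jackrabbit's real PersistenceManager interface has a different shape.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface for illustration only; Jackrabbit's real
// PersistenceManager interface has a different shape.
interface OpaqueStateStore {
    void store(String itemId, byte[] state);
    byte[] load(String itemId);      // returns null when the item is unknown
    boolean exists(String itemId);
}

// In-memory stand-in for a real backend (files, a b-tree based database, ...).
final class InMemoryStateStore implements OpaqueStateStore {
    private final Map<String, byte[]> states = new ConcurrentHashMap<>();

    @Override
    public void store(String itemId, byte[] state) {
        // Copy the bytes so later changes by the caller cannot alter stored data.
        states.put(itemId, Arrays.copyOf(state, state.length));
    }

    @Override
    public byte[] load(String itemId) {
        byte[] state = states.get(itemId);
        // Hand out a copy for the same reason; the content is never inspected.
        return state == null ? null : Arrays.copyOf(state, state.length);
    }

    @Override
    public boolean exists(String itemId) {
        return states.containsKey(itemId);
    }
}
```

Nothing in this sketch looks inside the bytes: the store has no notion of node types, property values or references, which is what leaves the storage backend free to be swapped without touching the layers above.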

=== What about ORM-backed PMs? ===
Persistence managers that store the item states in a complex schema are not the right way
to go. Keep it simple: the objectPersistenceManager, for example, stores the item states as a
raw stream of bytes.
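
To make the 'raw stream of bytes' idea concrete, the following sketch writes a simplified property state to a byte array and reads it back. The class `SimplePropertyState` and its byte layout are assumptions made up for this example; they do not reproduce the format that objectPersistenceManager actually uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a property state;
// not Jackrabbit's own class and not its on-disk format.
final class SimplePropertyState {
    final String name;
    final List<String> values;

    SimplePropertyState(String name, List<String> values) {
        this.name = name;
        this.values = new ArrayList<>(values);
    }

    // Writes the name, the value count, then each value, as one flat byte stream.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(name);
            out.writeInt(values.size());
            for (String value : values) {
                out.writeUTF(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the fields back in exactly the order toBytes() wrote them.
    static SimplePropertyState fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String name = in.readUTF();
            int count = in.readInt();
            List<String> values = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                values.add(in.readUTF());
            }
            return new SimplePropertyState(name, values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Compared with mapping each property to rows and columns, a flat stream like this needs no schema and no mapping layer; the trade-off is that the stored data is not human readable.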

=== What combination of FS and PM is the best choice? ===
It depends on your priorities. If you want to store your data in an accessible format (just
in case ;), you might want to try XML PM + localFileSystem. If you use Windows and performance
is a must, you might want to try objectPersistenceManager + CQFS.

=== What are the current options? What are the status, pros and cons of each implementation? ===

=== objectPersistenceManager ===
 * Status: mature
 * Simple
 * Not human readable
 * An inconsistency is hard to fix without a tool
 * Easy to configure
 * Write operations are synchronized
 * If the JVM process is killed, the repository might become inconsistent
 * Non-transactional

=== XML persistenceManager ===
 * Status: mature
 * Not so simple, but human readable
 * Easy to configure
 * Write operations are synchronized
 * If the JVM process is killed, the repository might become inconsistent
 * Non-transactional

=== ORM persistenceManagers ===
 * Status: work in progress
 * Unnecessary complexity
 * Transactional
 * RDBMS referential integrity (possible, but not implemented yet)
 * Not so easy to configure
 * Multithreading friendly: write operations don't need to be synchronized

=== localFileSystem ===
 * Status: mature
 * Slow on Windows boxes

=== CQFS file system ===
 * Status: mature
 * Mysterious configuration options ;)
 * Mysterious proprietary binary format ;)
 * Fast on Windows
 * License issue: it's proprietary
