Mailing-List: contact commits-help@jackrabbit.apache.org; run by ezmlm
Delivered-To: mailing list commits@jackrabbit.apache.org
Reply-To: dev@jackrabbit.apache.org
From: Apache Wiki
To: Apache Wiki
Date: Mon, 07 Dec 2009 08:43:02 -0000
Message-ID: <20091207084302.25164.83800@eos.apache.org>
Subject: [Jackrabbit Wiki] Update of "PersistenceManagerFAQ" by ThomasMueller

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Jackrabbit Wiki" for change notification.

The "PersistenceManagerFAQ" page has been changed by ThomasMueller.
http://wiki.apache.org/jackrabbit/PersistenceManagerFAQ?action=diff&rev1=33&rev2=34

--------------------------------------------------

- = Persistence Manager (PM) FAQ =
+ == Persistence Manager ==

+ <>
+
+ == Overview ==
+
- == What is a Persistence Manager (PM)? ==
+ === What Is a Persistence Manager (PM)? ===
  The PM is an *internal* Jackrabbit component that handles the persistent storage of content nodes and properties. Property values are also stored in the persistence manager, with the exception of large binary values (those are usually kept in the DataStore).

  Each workspace of a Jackrabbit content repository uses a separate persistence manager to store the content in that workspace. The Jackrabbit version handler also uses a separate persistence manager.

@@ -26, +30 @@

  The PM sits at the very bottom layer in Jackrabbit's system architecture. Reliability, integrity and performance of the PM are *crucial* to the overall stability and performance of the repository. If, for example, the data that a PM is based upon is allowed to change through external means, the integrity of the repository would be at risk (think of referential integrity and node references).

- ==== Which Persistence Manager is the fastest? ====
+ === Which Persistence Manager Is the Fastest? ===
  The bundle persistence managers are usually the fastest. Bundle persistence managers store each node together with all its properties as one unit. Large binary properties are stored in the BLOBStore by default (or in the DataStore if configured). Setting the minimum blob size for bundle persistence managers very high decreases performance.

  Storing the data in the file system does not require a database. Depending on the file system and database, database persistence managers are sometimes slower and sometimes faster than the Bundle``Fs``Persistence``Manager.
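
The minimum blob size mentioned above is set per persistence manager in the workspace configuration. The fragment below is a minimal sketch, assuming the Derby bundle persistence manager and its `minBlobSize` parameter. The URL, the prefix and the 4096-byte threshold are illustrative values, not settings taken from this page.

```xml
<!-- Sketch only: all parameter values are illustrative assumptions -->
<PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.DerbyPersistenceManager">
  <param name="url" value="jdbc:derby:${wsp.home}/db;create=true"/>
  <param name="schemaObjectPrefix" value="${wsp.name}_"/>
  <!-- Binary values smaller than this (in bytes) stay inline in the bundle;
       larger ones are stored in the BLOBStore -->
  <param name="minBlobSize" value="4096"/>
</PersistenceManager>
```

Keeping the threshold small keeps bundles compact, which is why a very high value tends to slow down reads and writes of ordinary nodes.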

  When using a database, please note that embedded Java databases do not have network overhead.

- ==== Consistency / Atomicity ====
+ === Consistency / Atomicity ===

  The database persistence managers are atomic if the database is atomic.

  The current file-based persistence managers are not always atomic. They do support transactions in Jackrabbit; the exception is after a crash. When the process is stopped while a transaction is being written to disk (power failure, process killed, Runtime.halt() called, VM crash), some data of a transaction may be committed and some not. Theoretically, some nodes may be corrupt (depending on how and when the system crashed). The algorithms used minimize this risk: for example, the parent node is written last, so in most cases there is no problem even after a crash.


- ==== What's the PM responsibility? ====
+ === What's the PM Responsibility? ===
  The PM interface was never intended to be a general SPI that you could implement in order to integrate external data sources with proprietary formats (e.g. a customer's database). The reason why we abstracted the PM interface was to leave room for future performance optimizations that would not affect the rest of the implementation (e.g. by storing the raw data in a b-tree based database instead of individual files).

- ==== How smart should a PM be? ====
+ === How Smart Should a PM Be? ===
  A PM should not be 'intelligent'. It should not 'interpret' the data. The only thing a PM should care about is to efficiently, consistently and reliably store and read the data encapsulated in the passed nodeState & propertyState objects. Though it might be feasible to write a custom persistence manager to represent existing legacy data in a level-1 (read-only) repository, I don't think the same is possible for a level-2 repository. At a minimum, it certainly would not be recommended.

- ==== File System (FS) ====
+ === File System (FS) ===

  Jackrabbit uses the org.apache.jackrabbit.core.fs.FileSystem interface as a file system abstraction. Although this interface does not cover all direct file system use of Jackrabbit, it still allows for flexibility in selecting where and how to store various parts of the repository. For example, because it is possible with Jackrabbit to configure separate file systems for different system components (e.g., global repository state, workspaces, search indexes, versioning, etc.), it might make sense to store the search indexes on a fast disk and the archived node versions on a slower disk.


- ==== What combination of FS and PM is the best choice? ====
+ === What Combination of FS and PM is the Best Choice? ===
  It depends on your priorities. If you want to store your data in an RDBMS, use Bundle``Db``Persistence``Manager in conjunction with either a Local``File``System or Db``File``System. If you want to store your data in a more readily accessible format (just in case ;), you might want to try an XML``Persistence``Manager paired with a Local``File``System.

  == Available Implementations ==

- ==== Bundle Database PM ====
+ === Bundle Database PM ===
   * Status: mature (the default persistence manager)
   * Depending on the database, one of the following:
    * org.apache.jackrabbit.core.persistence.bundle.Derby``Persistence``Manager (Apache Derby; Java)
@@ -69, +73 @@

   * The tables are automatically created. To create them manually, see [[ManuallyCreatingDatabaseTables]].
   * [[http://jackrabbit.apache.org/api/1.5/org/apache/jackrabbit/core/persistence/bundle/BundleDbPersistenceManager.html|BundleDbPersistenceManager]]

- ==== Bundle File-System PM ====
+ === Bundle File-System PM ===
   * Status: mature
   * If the JVM process is killed, the repository might become inconsistent
   * Not meant to be used in production environments (except for read-only uses)
@@ -77, +81 @@

   * Very fast if used with DataStore or BLOBStore
   * [[http://jackrabbit.apache.org/api/1.4/org/apache/jackrabbit/core/persistence/bundle/BundleFsPersistenceManager.html|BundleFsPersistenceManager]]

- ==== In-Memory PM ====
+ === In-Memory PM ===
   * Status: mature
   * All data is lost as soon as the repository is closed
   * org.apache.jackrabbit.core.persistence.mem.In``Mem``Persistence``Manager
@@ -87, +91 @@

   * Very fast
   * [[http://jackrabbit.apache.org/api/1.4/org/apache/jackrabbit/core/persistence/mem/InMemPersistenceManager.html|InMemPersistenceManager]]

- ==== Simple Database PM ====
+ === Simple Database PM ===
   * Status: mature
   * Subclasses of org.apache.jackrabbit.core.persistence.db.Simple``Db``Persistence``Manager
   * JDBC based; zero-deployment: schema is automatically created
@@ -95, +99 @@

   * Fast
   * [[http://jackrabbit.apache.org/api/1.4/org/apache/jackrabbit/core/persistence/db/SimpleDbPersistenceManager.html|SimpleDbPersistenceManager]]

- ==== ObjectPersistenceManager ====
+ === ObjectPersistenceManager ===
   * Status: obsolete, mature
   * If the JVM process is killed, the repository might become inconsistent
   * Not meant to be used in production environments
   * Persists data in an abstract File``System using a simple binary serialization format

- ==== XMLPersistenceManager ====
+ === XMLPersistenceManager ===
   * Status: obsolete, mature
   * If the JVM process is killed, the repository might become inconsistent
   * Persists data in an abstract File``System using an XML serialization format

- ==== ORMPersistenceManager ====
+ === ORMPersistenceManager ===
   * Status: obsolete, experimental & unfinished, still being maintained?
   * Referential integrity is possible, but not implemented
   * Not so easy to configure.

- ==== LocalFileSystem: ====
+ === LocalFileSystem: ===
   * Status: mature
   * Slow on Windows boxes

- ==== MemoryFileSystem: ====
+ === MemoryFileSystem: ===
   * Status: mature
   * All data is lost as soon as the repository is closed
   * For testing and small (read-only) workspaces
   * Keeps all content in memory
   * Very fast

- ==== DbFileSystem: ====
+ === DbFileSystem: ===
   * Status: mature
   * Atomic
   * Meant to be used in combination with a Database Persistence Manager as repository & workspace file system
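
The pairing of a DbFileSystem with a database persistence manager described above could look roughly like the following workspace fragment. This is a sketch under assumptions: the MySQL driver, JDBC URL, credentials and prefixes are placeholders chosen for illustration, and a real workspace definition contains further elements (such as the search index) that are omitted here.

```xml
<!-- Sketch only: driver, URL, credentials and prefixes are placeholder assumptions -->
<Workspace name="${wsp.name}">
  <!-- Workspace file system kept in the database -->
  <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
    <param name="driver" value="com.mysql.jdbc.Driver"/>
    <param name="url" value="jdbc:mysql://localhost/jackrabbit"/>
    <param name="user" value="jackrabbit"/>
    <param name="password" value="secret"/>
    <param name="schema" value="mysql"/>
    <param name="schemaObjectPrefix" value="fs_${wsp.name}_"/>
  </FileSystem>
  <!-- Bundle database persistence manager pointing at the same database -->
  <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.BundleDbPersistenceManager">
    <param name="driver" value="com.mysql.jdbc.Driver"/>
    <param name="url" value="jdbc:mysql://localhost/jackrabbit"/>
    <param name="user" value="jackrabbit"/>
    <param name="password" value="secret"/>
    <param name="schema" value="mysql"/>
    <param name="schemaObjectPrefix" value="pm_${wsp.name}_"/>
  </PersistenceManager>
</Workspace>
```

Using distinct `schemaObjectPrefix` values keeps the file system tables and the persistence manager tables apart when both share one database.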