From: Apache Wiki
To: commits@jackrabbit.apache.org
Reply-To: dev@jackrabbit.apache.org
Date: Thu, 13 Sep 2007 15:24:39 -0000
Message-ID: <20070913152439.9987.12389@eos.apache.org>
Subject: [Jackrabbit Wiki] Update of "DataStore" by ThomasMueller

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Jackrabbit Wiki" for change notification.

The following page has been changed by ThomasMueller:
http://wiki.apache.org/jackrabbit/DataStore

------------------------------------------------------------------------------
  To use the FileDataStore, add this to your repository.xml after the start tag:

+ {{{
-
+
+ }}}

  == Additional configuration options ==

  This is a full configuration using the default values:

+ {{{
-
+
-
+
+ }}}

- == Clustering ==
+ == FAQ ==

  Clustering is supported if you use a clustered file system. You need to set the data store path of all cluster nodes to the same location.

+ Transaction: transactional semantics are guaranteed.
+
+ There is only one data store per repository (not one per Workspace).
+
+ Backup: It is very easy to back up the data store: just copy all files. They are never modified, and only renamed from temp file to live file. They are deleted only when no longer used (and only by the garbage collector). Backup can be incremental.
+
  == How does it work ==

- When adding a binary object, Jackrabbit checks its size. When it is larger than minRecordLength, it is added to the data store, otherwise it is kept in-memory. This is done very early (possibly when calling Property.setValue(stream)). Only the unique data identifier is stored in the persistence manager (except for in-memory objects, where the data is stored). When updating a value, the old value is kept there and the new value is added (there is no update operation).
+ When adding a binary object, Jackrabbit checks its size. When it is larger than minRecordLength, it is added to the data store, otherwise it is kept in-memory. This is done very early (possibly when calling Property.setValue(stream)). Only the unique data identifier is stored in the persistence manager (except for in-memory objects, where the data is stored). When updating a value, the old value is kept there (potentially becoming garbage) and the new value is added. There is no update operation.

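
The flow described under "How does it work" can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Jackrabbit's actual implementation: the class `DataStoreSketch`, the `add` method, the `inline:` and `store:` prefixes, and the use of a SHA-1 content hash as the unique identifier are all hypothetical choices made for the example. For brevity the sketch reads the whole stream into memory, which a real data store would avoid.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of an add-only binary store with a size threshold.
class DataStoreSketch {
    private final Path directory;
    private final int minRecordLength;

    DataStoreSketch(Path directory, int minRecordLength) {
        this.directory = directory;
        this.minRecordLength = minRecordLength;
    }

    // Returns the reference that a persistence manager would keep:
    // the data itself for small values, only an identifier for large ones.
    String add(InputStream in) throws IOException {
        byte[] data = readFully(in);
        if (data.length <= minRecordLength) {
            return "inline:" + hex(data);          // small value: kept with the item
        }
        String id = hex(sha1(data));               // assumed identifier: content hash
        Path live = directory.resolve(id);
        if (!Files.exists(live)) {
            // Write to a temp file, then rename to the live name.
            // Existing files are never modified; there is no update operation.
            Path temp = Files.createTempFile(directory, "tmp", null);
            Files.write(temp, data);
            Files.move(temp, live);
        }
        return "store:" + id;
    }

    private static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    private static byte[] sha1(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("datastore");
        DataStoreSketch store = new DataStoreSketch(dir, 100);
        byte[] small = "tiny".getBytes(StandardCharsets.UTF_8);
        byte[] large = new byte[1000];
        System.out.println(store.add(new ByteArrayInputStream(small)));
        System.out.println(store.add(new ByteArrayInputStream(large)));
    }
}
```

In this sketch, "updating" a value simply means calling `add` again with the new content. The file for the old content stays in place until something like a garbage collector removes it, which is also why copying the directory is enough for a backup.
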
  The current implementation still stores temporary files in some situations, for example in the RMI client. Those cases will be changed to use the data store directly where it makes sense.

@@ -37, +47 @@

  New implementations are welcome! An S3 data store would be cool (http://en.wikipedia.org/wiki/Amazon_S3). Maybe somebody needs a database data store. A caching data store would be great as well (items that are used a lot are stored in a fast file system, others in a slower one).

+ == Future ideas ==
+
+ Theoretically the data store could be split across different directories / hard drives. Content that is accessed more often could be moved to a faster disk, and less used data could eventually be moved to a slower / cheaper disk. That would be an extension of the 'memory hierarchy' (see also http://en.wikipedia.org/wiki/Memory_hierarchy). Of course this wouldn't limit the space used per workspace, but would improve system performance if done right. Maybe we need to do that anyway in the near future to better support solid state disks.
+
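
One way to picture the tiering idea is a policy that picks a directory for each record based on how often it has been read. The sketch below is purely hypothetical: `TierPolicySketch`, the in-memory read counter, the `hotThreshold` parameter, and the two directory names are invented for illustration and are not part of Jackrabbit.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical policy: frequently read records belong on the fast disk,
// everything else on the slower / cheaper one.
class TierPolicySketch {
    private final Path fastDir;
    private final Path slowDir;
    private final int hotThreshold;
    private final Map<String, Integer> reads = new HashMap<>();

    TierPolicySketch(Path fastDir, Path slowDir, int hotThreshold) {
        this.fastDir = fastDir;
        this.slowDir = slowDir;
        this.hotThreshold = hotThreshold;
    }

    // Called whenever a record is read.
    void recordRead(String id) {
        reads.merge(id, 1, Integer::sum);
    }

    // Where the record should live, given the reads seen so far.
    Path targetFor(String id) {
        boolean hot = reads.getOrDefault(id, 0) >= hotThreshold;
        return (hot ? fastDir : slowDir).resolve(id);
    }

    public static void main(String[] args) {
        TierPolicySketch policy =
                new TierPolicySketch(Paths.get("fast"), Paths.get("slow"), 2);
        System.out.println(policy.targetFor("abc"));
        policy.recordRead("abc");
        policy.recordRead("abc");
        System.out.println(policy.targetFor("abc"));
    }
}
```

Because records are never modified after they are written, a background job could move a file to its target directory without coordinating with writers. How readers would locate a moved file is left open here.
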