jackrabbit-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Jackrabbit Wiki] Update of "DataStore" by ThomasMueller
Date Thu, 13 Sep 2007 15:24:39 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Jackrabbit Wiki" for change notification.

The following page has been changed by ThomasMueller:
http://wiki.apache.org/jackrabbit/DataStore

------------------------------------------------------------------------------
  
  To use the FileDataStore, add this to your repository.xml after the <Repository>
start tag:
  
+ {{{
-     <DataStore class="org.apache.jackrabbit.core.data.File``Data``Store"/> 
+     <DataStore class="org.apache.jackrabbit.core.data.FileDataStore"/> 
+ }}}
  
  == Additional configuration options ==
  
  This is a full configuration using the default values:
  
+ {{{
-     <DataStore class="org.apache.jackrabbit.core.data.File``Data``Store">
+     <DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
          <param name="path" value="${rep.home}/repository/datastore"/>
          <param name="minRecordLength" value="100"/>
-     </Data``Store>
+     </DataStore>
+ }}}
  
- == Clustering ==
+ == FAQ ==
  
  Clustering is supported if you use a clustered file system. You need to set the data store
path of all cluster nodes to the same location.
  
+ Transaction: transactional semantics are guaranteed.
+ 
+ There is only one data store per repository (not one per Workspace).
+ 
+ Backup: Backing up the data store is easy: just copy all files. Files are never modified;
they are only renamed from a temp file to a live file, and they are deleted only when no longer
used (and only by the garbage collector). Backups can be incremental.
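
Because files in the data store are never modified, an incremental backup only needs to copy files that are missing from the backup location. The following is a minimal sketch under that assumption, not part of Jackrabbit: the class IncrementalBackupSketch and its backup method are hypothetical names, and the sketch copies everything it finds (including any temp files) and does not guard against a backup copy that was interrupted halfway.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; not a Jackrabbit class. */
class IncrementalBackupSketch {

    /**
     * Copies every regular file under source that does not yet exist
     * under target, and returns the number of files copied.
     */
    static int backup(Path source, Path target) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(source)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int copied = 0;
        for (Path file : files) {
            Path dest = target.resolve(source.relativize(file));
            if (Files.exists(dest)) {
                // Assumption: a file with the same name has the same content,
                // because data store files are never modified.
                continue;
            }
            Files.createDirectories(dest.getParent());
            Files.copy(file, dest);
            copied++;
        }
        return copied;
    }
}
```

Calling backup a second time with the same arguments copies nothing unless new files were added to the data store in between.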
+ 
  == How does it work ==
  
- When adding a binary object, Jackrabbit checks the size of it. When it is larger than minRecordLength,
it is added to the data store, otherwise it is kept in-memory. This is done very early (possible
when calling Property.setValue(stream)). Only the unique data identifier is stored in the
persistence manager (except for in-memory objects, where the data is stored). When updating
a value, the old value is kept there an the new value is added (there is no update operation).
+ When adding a binary object, Jackrabbit checks its size. When it is larger than minRecordLength,
it is added to the data store, otherwise it is kept in-memory. This is done very early (possibly
when calling Property.setValue(stream)). Only the unique data identifier is stored in the
persistence manager (except for in-memory objects, where the data is stored). When updating
a value, the old value is kept there (potentially becoming garbage) and the new value is added.
There is no update operation.
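
The behaviour described above can be sketched roughly as follows. This is an illustration, not Jackrabbit's implementation: the class DataStoreSketch and the methods usesDataStore and addRecord are hypothetical, and deriving the identifier from a SHA-1 hash of the content is an assumption (the page only says the identifier is unique). The sketch writes to a temp file first and then renames it to the live file, and it never overwrites an existing record.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch for illustration; not a Jackrabbit class. */
class DataStoreSketch {
    private final Path dir;
    private final int minRecordLength;

    DataStoreSketch(Path dir, int minRecordLength) {
        this.dir = dir;
        this.minRecordLength = minRecordLength;
    }

    /** Values larger than minRecordLength go to the data store. */
    boolean usesDataStore(long length) {
        return length > minRecordLength;
    }

    /** Stores the stream as a new record and returns its identifier. */
    String addRecord(InputStream in) throws IOException, NoSuchAlgorithmException {
        Files.createDirectories(dir);
        // Assumption: the identifier is the SHA-1 hash of the content.
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        Path temp = Files.createTempFile(dir, "tmp", null);
        try (DigestInputStream din = new DigestInputStream(in, digest)) {
            Files.copy(din, temp, StandardCopyOption.REPLACE_EXISTING);
        }
        StringBuilder id = new StringBuilder();
        for (byte b : digest.digest()) {
            id.append(String.format("%02x", b));
        }
        Path live = dir.resolve(id.toString());
        if (Files.exists(live)) {
            // Same content is already stored; records are never updated.
            Files.delete(temp);
        } else {
            // Rename the temp file to the live file.
            Files.move(temp, live);
        }
        return id.toString();
    }

    public static void main(String[] args) throws Exception {
        DataStoreSketch store = new DataStoreSketch(Files.createTempDirectory("datastore"), 100);
        byte[] data = new byte[200];
        System.out.println("uses data store: " + store.usesDataStore(data.length));
        System.out.println("identifier: " + store.addRecord(new ByteArrayInputStream(data)));
    }
}
```

In this sketch, "updating" a value simply means calling addRecord with the new content and storing the returned identifier; the old record stays on disk until something like a garbage collector removes it.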
  
  The current implementation still stores temporary files in some situations, for example
in the RMI client. Those cases will be changed to use the data store directly where it makes
sense.
  
@@ -37, +47 @@

  
  New implementations are welcome! An S3 data store (http://en.wikipedia.org/wiki/Amazon_S3)
would be cool. Maybe somebody needs a database data store. A caching data store would be great
as well (items that are used a lot are stored on a fast file system, others on a slower one).
  
+ == Future ideas ==
+ 
+ Theoretically the data store could be split across different directories / hard drives. Content
that is accessed more often could be moved to a faster disk, and less used data could eventually
be moved to a slower / cheaper disk. That would be an extension of the 'memory hierarchy' (see
also http://en.wikipedia.org/wiki/Memory_hierarchy). Of course this wouldn't limit the space
used per workspace, but it would improve system performance if done right. Maybe we need to do
that anyway in the near future to better support solid state disks.
+ 
