jackrabbit-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Jackrabbit Wiki] Update of "DataStore" by DarrenHartford
Date Fri, 17 Jul 2009 15:21:24 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Jackrabbit Wiki" for change notification.

The following page has been changed by DarrenHartford:

   * Full-text search and metadata extraction could be done when the object is stored (only
once per object), and the results stored next to the object.
   * Clients should first send the checksum and size of large objects when storing something
(importing, adding, or updating data); in many cases the actual data would not need to be sent at all.
   * Speed up garbage collection. One idea is to use 'back references' for larger objects:
each larger object would know the set of nodes that reference it. This would be an 'append
only' set, meaning that at runtime links are only added, never removed. Only the garbage
collection process removes links. The garbage collection would first update the links for large
objects (this process could stop at the first link that still exists). That way large objects
can be removed quickly if they are no longer used. Afterwards, objects with a low use count
should be scanned. This algorithm wouldn't necessarily reduce the total garbage collection
time, but it would free up space more quickly.
+  * Compressed data store (file, DB) - if the content type and size make it likely that
compression would yield large disk-space savings, set the data store to auto-compress (whether
zip, gzip, bzip2, etc.). The file data store is a more likely candidate for this feature than the
DB data store. (user added 7/17/2009)
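The checksum-first idea above could be sketched roughly as follows. This is a hypothetical illustration, not Jackrabbit's actual data store API: the `ContentStore` class, the SHA-256 hex identifier scheme, and the in-memory map are all assumptions made for the example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical content-addressed store illustrating the
 *  "send checksum and size first" idea: if an object with the
 *  same digest is already present, the client can skip the upload. */
public class ContentStore {
    private final Map<String, byte[]> blobs = new HashMap<>();

    /** Hex-encoded SHA-256 digest of the data (an assumed identifier scheme). */
    public static String checksum(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Step 1: the client asks whether the data must be sent at all. */
    public boolean needsUpload(String checksum) {
        return !blobs.containsKey(checksum);
    }

    /** Step 2: only called when needsUpload() returned true. */
    public String store(byte[] data) {
        String id = checksum(data);
        blobs.putIfAbsent(id, data.clone());
        return id;
    }
}
```

With this two-step exchange, an update that re-imports already-stored content costs only a checksum lookup instead of a full transfer.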
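The back-reference idea could look something like the sketch below. It is a simplified, assumed design (the class and method names are invented for illustration): references are appended at runtime, and garbage collection walks each object's links, stopping at the first one that still exists.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of 'back references' for large objects:
 *  each object tracks the nodes that reference it, and only the
 *  garbage collector ever removes links. */
public class BackRefGc {
    // objectId -> set of node ids that (possibly) reference it
    private final Map<String, Set<String>> backRefs = new HashMap<>();

    /** Runtime path: append only -- links are added, never removed. */
    public void addReference(String objectId, String nodeId) {
        backRefs.computeIfAbsent(objectId, k -> new HashSet<>()).add(nodeId);
    }

    /**
     * GC path: for each large object, walk its recorded links and
     * stop at the first one that still exists. Objects with no
     * surviving link can be removed immediately, before the slower
     * scan of low-use-count objects begins.
     */
    public Set<String> findRemovable(String... liveNodeIds) {
        Set<String> live = new HashSet<>(Arrays.asList(liveNodeIds));
        Set<String> removable = new HashSet<>();
        for (Map.Entry<String, Set<String>> entry : backRefs.entrySet()) {
            boolean referenced = false;
            for (String nodeId : entry.getValue()) {
                if (live.contains(nodeId)) {
                    referenced = true; // stop at the first live link
                    break;
                }
            }
            if (!referenced) {
                removable.add(entry.getKey());
            }
        }
        return removable;
    }
}
```

As the bullet notes, total GC time is not necessarily lower, but unreferenced large objects are identified and freed early in the run.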
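The auto-compress decision could be driven by a heuristic like the one below. The size threshold and the list of already-compressed content types are illustrative assumptions, not Jackrabbit settings.

```java
/** Hypothetical policy for the auto-compress idea: compress only when
 *  the content type and size suggest a worthwhile saving. */
public class CompressionPolicy {
    // assumed threshold: tiny objects are not worth the CPU cost
    private static final long MIN_SIZE = 4 * 1024;

    // formats that are already compressed rarely shrink further
    private static final String[] INCOMPRESSIBLE = {
        "image/jpeg", "image/png", "video/mp4",
        "application/zip", "application/gzip"
    };

    public static boolean shouldCompress(String contentType, long size) {
        if (size < MIN_SIZE) {
            return false;
        }
        for (String t : INCOMPRESSIBLE) {
            if (t.equalsIgnoreCase(contentType)) {
                return false;
            }
        }
        return true; // e.g. text/plain, application/xml
    }
}
```

A file data store could apply such a policy transparently on write (e.g. via `java.util.zip.GZIPOutputStream`), which is one reason the file back end is the more likely candidate than the DB back end.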
