jackrabbit-dev mailing list archives

From "Thomas Mueller (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-1138) Data store garbage collection
Date Tue, 18 Sep 2007 13:07:43 GMT

    [ https://issues.apache.org/jira/browse/JCR-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12528377 ]

Thomas Mueller commented on JCR-1138:

To better support garbage collection for the data store, I suggest adding a new method to

    /**
     * Get all node ids.
     * A typical application will call this method multiple times, where 'after'
     * is the last node id read. The maxCount parameter defines the maximum number of
     * node ids returned, 0 meaning no limit. The order of the node ids is specific to the
     * given persistence manager. Items that are added concurrently may not be included.
     *
     * @param after the lower limit, or null for no limit.
     * @param maxCount the maximum number of node ids to return, or 0 for no limit.
     * @return an iterator over the node ids.
     * @throws ItemStateException if an error occurs while loading.
     */
    public abstract NodeIdIterator getAllNodeIds(NodeId after, int maxCount)
            throws ItemStateException;

I would add this only to the bundle persistence managers, because those persistence managers
are the most important ones (in my view).

This method is then called from the garbage collection process (or from a background thread
from time to time, with a low maxCount and with enough sleep time in between). After all nodes
are processed, the objects in the data store that were never scanned are deleted. This mechanism
is better than the current mechanism as it can be restarted: only the last visited node needs
to be persisted. It is also more efficient as the persistence manager can return the data
in the order it is stored (which is easy for BundleFsPersistenceManager).
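
As a rough illustration of the calling pattern described above, here is a minimal sketch in Java. It is not Jackrabbit code: String ids stand in for NodeId, a List stands in for NodeIdIterator, and the names PagedScanSketch, NodeIdSource, InMemorySource and scan are hypothetical. The in-memory source returns ids in sorted order, which is an assumption made only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Sketch only: String ids stand in for NodeId, List stands in for NodeIdIterator.
class PagedScanSketch {

    /** Hypothetical stand-in for a bundle persistence manager. */
    interface NodeIdSource {
        // Mirrors the proposed method: 'after' null means start from the
        // beginning, maxCount 0 means no limit.
        List<String> getAllNodeIds(String after, int maxCount);
    }

    /** In-memory source that returns ids in sorted order (assumption). */
    static class InMemorySource implements NodeIdSource {
        private final TreeSet<String> ids = new TreeSet<>();

        InMemorySource(List<String> initial) {
            ids.addAll(initial);
        }

        @Override
        public List<String> getAllNodeIds(String after, int maxCount) {
            List<String> page = new ArrayList<>();
            // Everything strictly greater than 'after', or everything if null.
            Iterable<String> tail = (after == null) ? ids : ids.tailSet(after, false);
            for (String id : tail) {
                if (maxCount > 0 && page.size() >= maxCount) {
                    break;
                }
                page.add(id);
            }
            return page;
        }
    }

    /**
     * Visits all ids in pages of at most pageSize, starting after 'checkpoint'.
     * Returns the ids visited by this run.
     */
    static List<String> scan(NodeIdSource source, String checkpoint,
                             int pageSize, long sleepMillis) {
        List<String> visited = new ArrayList<>();
        String after = checkpoint;
        while (true) {
            List<String> page = source.getAllNodeIds(after, pageSize);
            if (page.isEmpty()) {
                break; // all nodes processed
            }
            // A real collector would mark the data store objects
            // referenced by each node here.
            visited.addAll(page);
            // Only this value would need to be persisted to allow a restart.
            after = page.get(page.size() - 1);
            try {
                Thread.sleep(sleepMillis); // keep the background load low
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return visited;
    }
}
```

Calling scan(source, null, 2, 100) would walk all ids two at a time with a pause between pages. Passing a previously saved id as the checkpoint continues after that id instead of starting over.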

What do you think, is this approach OK? 

> Data store garbage collection
> -----------------------------
>                 Key: JCR-1138
>                 URL: https://issues.apache.org/jira/browse/JCR-1138
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: core
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
> Currently the data store garbage collection needs to be run manually. It should be simpler
> to use (maybe tool based), or automatic.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
