jackrabbit-users mailing list archives

From John Langley <dige...@gmail.com>
Subject Re: example of running the Garbage Collector?
Date Tue, 03 Apr 2012 12:05:03 GMT
Thanks Alex, however I was hoping for something different.

The approach used in the test code is to create a separate transient
repository, which is something we've been doing for a while. When we
create this repository we make it join as part of a cluster, which of
course means it needs to index all the files, affects the journal
table, etc.

What I was hoping for is something we could just run in the same
environment as our primary jackrabbit instance, i.e. with the same
repository that the RepositoryAccessServlet uses. In fact, I'd love to
get the repository in that same way.

Here's some code I tried, but it never did the garbage collection,
even though it "seems" to work, i.e. runs w/out failure.

    protected String servletRepositoryGC() {
        String retVal = "fail.";
        Session session = null;
        DataStoreGarbageCollector garbageCollector = null;
        Repository repository =
        try {
            // Credentials to create a valid session for the user to access the
            // repository's DataStoreGarbageCollector.
            String userName = "admin";
            String password = "somepassword";
            session = repository.login(new SimpleCredentials(userName, password.toCharArray()));
            garbageCollector =

            logger.info(">>>> HACK >>>> Mark Phase Is Starting.");

            logger.info(">>>> HACK >>>> Mark Phase Is Complete. Sweep Phase Is Starting.");

            logger.info(">>>> HACK >>>> Sweep Phase Is Complete.");
            retVal = "ok.";
        } catch (Exception e) {
            logger.severe(">>>> HACK >>>> Exception while Garbage Collection" + e.toString());
        } finally {
        return retVal;

The measure of true garbage collection for us is select count(*) on
the DATASTORE table in the RDBMS (mysql in our case). We can see the
count go down when GC runs correctly, which it does if we start the
transient repository as described earlier.

So... how ~do~ people with production Jackrabbit servers do Garbage
Collection? It seems like there are only 3 options:
1) shut down your repository and use the technique that the unit test uses
2) leave your repository running, but add a transient repository as a
cluster member (means your primary JR instance must run in cluster
mode too). This means you can do this as a separate process and run
with a cronjob or on demand.
3) Find a way to run it in process on a timer thread with the same
repository definition as the main JR instance. This is the option I'm
looking for.
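
For illustration, the timer-thread half of option 3 might be sketched
roughly as below. This is only a sketch under stated assumptions, not a
tested Jackrabbit recipe: Collector is a hypothetical stand-in interface
whose mark/sweep/close methods are modeled on the collector used in the
code above, and ScheduledGc, runOnce and schedule are made-up names. How
to obtain a collector from the repository that the
RepositoryAccessServlet uses is exactly the open question, so that part
is left to the caller-supplied factory.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class ScheduledGc {

    /** Hypothetical stand-in for a data store garbage collector (assumption, not a Jackrabbit type). */
    interface Collector {
        void mark() throws Exception;   // scan the repository and mark records still in use
        int sweep() throws Exception;   // delete unmarked records, return how many were removed
        void close();                   // release whatever the collector holds
    }

    /** Runs one mark/sweep cycle. Returns the number of removed records, or -1 on failure. */
    static int runOnce(Supplier<Collector> factory) {
        Collector gc = null;
        try {
            gc = factory.get();
            gc.mark();
            return gc.sweep();
        } catch (Exception e) {
            // Swallow the failure: an exception escaping a scheduled task cancels all later runs.
            System.err.println("Garbage collection failed: " + e);
            return -1;
        } finally {
            if (gc != null) {
                gc.close();
            }
        }
    }

    /** Starts a daemon timer thread that runs a cycle, waits for the given delay, and repeats. */
    static ScheduledExecutorService schedule(Supplier<Collector> factory, long delay, TimeUnit unit) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "datastore-gc");
            t.setDaemon(true); // do not keep the JVM alive just for GC
            return t;
        });
        timer.scheduleWithFixedDelay(() -> runOnce(factory), delay, delay, unit);
        return timer;
    }

    public static void main(String[] args) {
        // In-memory fake, used only to exercise the harness.
        Collector fake = new Collector() {
            public void mark() {}
            public int sweep() { return 0; }
            public void close() {}
        };
        System.out.println("removed: " + runOnce(() -> fake));
    }
}
```

In a servlet container, schedule(...) would presumably be called once at
startup and the returned executor shut down on undeploy. A fixed delay
(rather than a fixed rate) keeps a slow mark phase from overlapping with
the next run.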

Thanks everyone! I think everyone who does a fair number of writes
will benefit from this. For our application we do LOTS of writes, so
GC is essential.

-- Langley
From: Alex Parvulescu [alex.parvulescu@gmail.com]
Sent: Monday, April 02, 2012 3:42 PM
To: users@jackrabbit.apache.org
Subject: Re: example of running the Garbage Collector?

Hi John,

This could get you started:

On Mon, Apr 2, 2012 at 12:24 PM, John Langley <digerat@gmail.com> wrote:
> Does anyone have a simple example of running the GarbageCollector in
> the same process as their repository?
> It seems like this is a best practice, but I don't see any examples of it.
> If it matters we're running 2.5.5.
> Thanks in advance!
> -- Langley
