jackrabbit-users mailing list archives

From John Langley <langleyatw...@gmail.com>
Subject Re: clustered garbage collection
Date Tue, 31 May 2011 18:14:01 GMT
Thank you... yes, it seems we have a simple solution: do the GC on a single
node using just a TransientRepository that's configured to use a slightly
specialized version of the repository.xml. The plan is to have it run as a
simple standalone app triggered by a cron job.

The repository.xml is specialized only in that it has a unique cluster id
(which of course it needs) and a datasource with concrete information in it
rather than the JNDI-based one that all the other cluster participants use
because they are appserver based.

Thanks again,

-- Langley

On Thu, May 26, 2011 at 8:09 AM, Thomas Mueller <mueller@adobe.com> wrote:

> Hi,
>
> The way garbage collection works, I don't see a potential problem if you
> run garbage collection concurrently.
>
> When garbage collection is running, each file that is accessed is
> 'touched' (the last modified time is changed to the current time). If you
> run it concurrently, this still will happen. At the end of the GC, old
> files (untouched files) are deleted.
>
> So it shouldn't be a problem. Of course I would avoid running it
> concurrently, because it's enough to run it on one cluster node (it's
> simply a waste of time to run it concurrently).
>
> Regards,
> Thomas
>
>
> On 5/26/11 1:22 PM, "John Langley" <langleyatwork@gmail.com> wrote:
>
> >First off, thanks to the writers of this great little description of how
> >to do garbage collection, and to Fabian for pointing it out.
> >http://wiki.apache.org/jackrabbit/DataStore#Data_Store_Garbage_Collection
> >
> >My next question concerns running garbage collection in a cluster. If I
> >had a number of identical nodes running in a cluster, each of them
> >periodically running a garbage collection task, where the periods may
> >overlap... say node 1 starts and then, in the middle of either the mark
> >or the sweep, node 2 starts its mark or perhaps even overlaps its
> >sweep.... what will the consequences be? Will they "collide", i.e. will
> >there be unexpected errors (explicit exception based errors) or
> >mis-behaviors (implicit non-identified errors)?
> >
> >Of course, the alternative is to guarantee that only one node in the
> >cluster is responsible for the periodic mark and sweep.
> >
> >Thanks in advance for any pointers or insights. This community has been
> >GREAT at responding to questions with very helpful solutions and bug fixes.
> >
> >-- Langley
>
>
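The touch-then-delete behaviour Thomas describes above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Jackrabbit's implementation: it uses plain java.nio.file calls on a temporary directory, the class name TouchGcSketch, its methods, and the file names are all hypothetical, and timestamps are passed in explicitly so the interleaving of two overlapping runs is deterministic. The real collector finds referenced binaries by scanning the repository; here the referenced set is simply handed in.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical model of touch-based mark and sweep; not Jackrabbit code.
class TouchGcSketch {

    /** Mark: "touch" every referenced file by setting its last-modified time. */
    static void mark(Path store, Collection<String> referenced, long touchTime) throws IOException {
        for (String name : referenced) {
            Path file = store.resolve(name);
            if (Files.exists(file)) {
                Files.setLastModifiedTime(file, FileTime.fromMillis(touchTime));
            }
        }
    }

    /** Sweep: delete files not touched since this run started; returns the count deleted. */
    static int sweep(Path store, long runStartedAt) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(store)) {
            for (Path file : files) {
                boolean untouched = Files.getLastModifiedTime(file).toMillis() < runStartedAt;
                // deleteIfExists tolerates a concurrent run having removed the file already
                if (untouched && Files.deleteIfExists(file)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }

    /** Two overlapping runs (A and B) on one shared store; returns surviving file names. */
    static List<String> demo() {
        try {
            Path store = Files.createTempDirectory("datastore-sketch");
            long t0 = 1_000_000_000_000L; // arbitrary fixed epoch millis (whole seconds)
            for (String name : List.of("in-use.bin", "orphan.bin")) {
                Path file = Files.write(store.resolve(name), new byte[] {1});
                Files.setLastModifiedTime(file, FileTime.fromMillis(t0));
            }
            List<String> referenced = List.of("in-use.bin");

            long startA = t0 + 10_000;             // run A starts
            long startB = t0 + 20_000;             // run B starts while A is still running
            mark(store, referenced, t0 + 30_000);  // A marks
            mark(store, referenced, t0 + 40_000);  // B marks (touches again, harmless)
            sweep(store, startA);                  // A sweeps: removes the untouched orphan
            sweep(store, startB);                  // B sweeps: nothing old is left

            List<String> survivors = new ArrayList<>();
            try (DirectoryStream<Path> files = Files.newDirectoryStream(store)) {
                for (Path file : files) {
                    survivors.add(file.getFileName().toString());
                    Files.delete(file); // clean up the temporary store
                }
            }
            Files.delete(store);
            return survivors;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [in-use.bin]
    }
}
```

In this model the second run only repeats work: the referenced file is touched twice and the orphan is deleted once, which matches the point that overlapping runs are wasteful rather than harmful.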
