From: Thomas Mueller
To: "users@jackrabbit.apache.org"
Date: Thu, 26 May 2011 13:09:33 +0100
Subject: Re: clustered garbage collection

Hi,

Given the way garbage collection works, I don't see a potential problem if you run garbage collection concurrently. While garbage collection is running, each file that is accessed is 'touched' (its last modified time is changed to the current time). If you run it concurrently, this will still happen. At the end of the GC, old files (untouched files) are deleted. So it shouldn't be a problem.

Of course I would avoid running it concurrently, because it's enough to run it on one cluster node (running it on several nodes at once is simply a waste of time).

Regards,
Thomas

On 5/26/11 1:22 PM, "John Langley" wrote:

>First off, thanks to the writers of this great little description of how to do
>garbage collection and Fabian for pointing it out.
>http://wiki.apache.org/jackrabbit/DataStore#Data_Store_Garbage_Collection
>
>My next question concerns running garbage collection in a cluster. If I had a
>number of identical nodes running in a cluster, each of them periodically
>running a garbage collection task, where the periods may overlap... say
>node 1 starts and then in the middle of either the mark or the sweep, node
>2 starts its mark or perhaps even overlaps its sweep.... what will
>the consequences be? Will they "collide", i.e.
>will there be unexpected errors (explicit exception-based errors) or
>misbehaviors (implicit non-identified errors)?
>
>Of course, the alternative is to guarantee that only one node in the cluster
>is responsible for the periodic mark and sweep.
>
>Thanks in advance for any pointers or insights. This community has been
>GREAT at responding to questions with very helpful solutions and bug fixes.
>
>-- Langley
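
To illustrate the touch-then-delete behaviour described in the reply, here is a minimal sketch of two cluster nodes whose mark and sweep phases overlap on a shared store. It is a toy in-memory simulation, not Jackrabbit code: the `DataStore` and `GarbageCollector` classes, the logical clock, and the record names are all hypothetical and chosen only for illustration. It assumes, as the reply suggests, that marking sets a record's last modified time to "now" and that sweeping deletes only records last modified before that collector's own start.

```python
import itertools


class DataStore:
    """Toy shared data store (hypothetical; not Jackrabbit's API)."""

    def __init__(self):
        # A logical clock stands in for wall-clock time so the run is deterministic.
        self._clock = itertools.count(1)
        self.last_modified = {}  # record id -> logical timestamp

    def now(self):
        return next(self._clock)

    def add(self, record_id):
        self.last_modified[record_id] = self.now()

    def touch(self, record_id):
        # "Touching" sets the last modified time to the current time.
        if record_id in self.last_modified:
            self.last_modified[record_id] = self.now()


class GarbageCollector:
    """One collector per cluster node (hypothetical illustration)."""

    def __init__(self, store, referenced):
        self.store = store
        self.referenced = list(referenced)
        self.start = None

    def mark(self):
        # Remember when this run started, then touch every record still in use.
        self.start = self.store.now()
        for record_id in self.referenced:
            self.store.touch(record_id)

    def sweep(self):
        # Delete only records that nobody has touched since this run started.
        stale = [r for r, t in self.store.last_modified.items() if t < self.start]
        for record_id in stale:
            del self.store.last_modified[record_id]
        return sorted(stale)


store = DataStore()
for name in ("a", "b", "c"):
    store.add(name)

# Both nodes see the same references: "a" and "b" are in use, "c" is garbage.
node1 = GarbageCollector(store, ["a", "b"])
node2 = GarbageCollector(store, ["a", "b"])

# Overlapping runs: node 2 starts its mark before node 1 has swept.
node1.mark()
node2.mark()
swept1 = node1.sweep()
swept2 = node2.sweep()

print(swept1, swept2, sorted(store.last_modified))
```

In this interleaving the second collector simply finds nothing left to delete, which matches the point that running on several nodes is redundant rather than harmful under these assumptions.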