Date: Wed, 5 Jun 2013 16:59:06 +0800
Subject: Re: about removing Old Revisions from journal table.
From: liang cheng
To: dev@jackrabbit.apache.org
Cc: users@jackrabbit.apache.org

Could someone kindly give me some help? Thanks.

Regards,
-Liang

2013/5/29 liang cheng

> Hi all,
>
> In our production environment, the Jackrabbit journal table grows large
> (more than 100,000 records) after running for two weeks. As a result, we
> plan to use the janitor thread to remove old revisions, as described in
> the "Removing Old Revisions" section of
> http://wiki.apache.org/jackrabbit/Clustering.
>
> Enabling it comes with several caveats, which are also listed on the wiki
> page:
>
> 1. If the janitor is enabled then you lose the possibility to easily add
>    cluster nodes. (It is still possible but takes detailed knowledge of
>    Jackrabbit.)
> 2. You must make sure that all cluster nodes have written their local
>    revision to the database before the clean-up task runs for the first
>    time, because otherwise cluster nodes might miss updates (because they
>    have been purged) and their local caches and search indexes get out of
>    sync.
> 3. If a cluster node is removed permanently from the cluster, then its
>    entry in the LOCAL_REVISIONS table should be removed manually.
>    Otherwise, the clean-up thread will not be effective.
>
> I understand point #3, but I am not quite sure about #1 and #2.
>
> #1 is our biggest concern.
> In our production environment, we sometimes need to add new cluster
> node(s), e.g. when system capacity cannot handle the current workload, or
> when a running node has to be stopped for a while for maintenance and a
> new node has to be added in its place. Caveat #1 only says that "you lose
> the possibility to easily add cluster nodes", but it doesn't explain why.
> As far as I know, when a new node is added to the Jackrabbit cluster, it
> has no Lucene index, so Jackrabbit builds the index for the whole current
> repository (starting from the root node). After this step, Jackrabbit
> processes the revisions generated by the other nodes. *I wonder what issue
> could arise when processing old revisions with the latest repository
> content in the cache and indexes?*
>
> For #2, *does it mean that manual work is needed to keep things
> consistent?*
>
> Although the wiki page gives one approach to adding a new cluster node
> manually (i.e. cloning the indexes and local revision number from an
> existing node), we still hope there is a safe programmatic way to avoid
> the manual work, because our production system is deployed on Amazon EC2
> and adding a new node needs to be as easy as possible.
>
> Could you please comment on my concerns? Thanks.
>
> Regards,
> -Liang
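
To make caveats #2 and #3 concrete, here is a minimal sketch of the clean-up behaviour they describe. It is a simplified model and not Jackrabbit's actual code: it assumes the janitor deletes every journal record below the lowest revision recorded in LOCAL_REVISIONS, and the table and column names (`journal`, `local_revisions`, `revision_id`, `journal_id`, `producer_id`) and the functions `purge_old_revisions` and `demo` are hypothetical stand-ins chosen for illustration.

```python
import sqlite3


def purge_old_revisions(conn):
    """Delete journal records below the lowest revision any node has recorded.

    Assumption: this mirrors the clean-up rule implied by the wiki caveats;
    it is not taken from Jackrabbit's source.
    """
    (min_rev,) = conn.execute(
        "SELECT MIN(revision_id) FROM local_revisions"
    ).fetchone()
    if min_rev is None:
        # No node has written a local revision yet, so nothing is safe to purge.
        return 0
    cur = conn.execute("DELETE FROM journal WHERE revision_id < ?", (min_rev,))
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journal (revision_id INTEGER PRIMARY KEY, producer_id TEXT)"
    )
    conn.execute(
        "CREATE TABLE local_revisions (journal_id TEXT PRIMARY KEY, revision_id INTEGER)"
    )
    # Ten journal records, revisions 1..10.
    conn.executemany(
        "INSERT INTO journal VALUES (?, ?)", [(i, "node1") for i in range(1, 11)]
    )
    # node1 is up to date; node2 stopped at revision 4 (e.g. it was retired).
    conn.executemany(
        "INSERT INTO local_revisions VALUES (?, ?)", [("node1", 10), ("node2", 4)]
    )

    # Caveat #3: the stale node2 entry holds the minimum at 4.
    purged_with_stale_entry = purge_old_revisions(conn)

    # Removing the retired node's entry lets the clean-up advance.
    conn.execute("DELETE FROM local_revisions WHERE journal_id = ?", ("node2",))
    purged_after_removal = purge_old_revisions(conn)

    remaining = [
        row[0]
        for row in conn.execute("SELECT revision_id FROM journal ORDER BY revision_id")
    ]
    return purged_with_stale_entry, purged_after_removal, remaining


if __name__ == "__main__":
    print(demo())
```

Under the same assumption, a node that has no row in the local revisions table at purge time is simply not taken into account, which seems to be the situation caveat #2 warns about for existing nodes and the reason a brand-new node cannot replay revisions that have already been deleted.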