From: Jason Smith
Date: Fri, 5 Jul 2013 16:21:06 +0700
Subject: Re: Purging documents and view invalidation
To: user@couchdb.apache.org

I slightly disagree with Bob, but he is right that all purge buys you (vs.
filtered replication and then swapping DBs) is a little bit of uptime.

Purge is not "untested" but it is rarely used in the wild, so the
cost/benefit for your uptime is something between "risky" and "unknown."
(For me, personally, I would purge.)

On Fri, Jul 5, 2013 at 3:31 PM, Robert Newson wrote:

> Paul,
>
> If you replicate this database to another database and use a filter
> that blocks deleted documents, the target will not contain a trace of
> your 100 million deletes (that is, you can build a new database
> without cruft without messing with your existing database). During the
> replication, you can query the view on the target to build it
> incrementally, or wait till the end, query it once and wait for
> completion. At the end, flip your app to look at the new database
> instead.
>
> The _purge feature is really only for the case where you accidentally
> write your root password down in a document id or something (since
> compaction will sweep away old document contents). I advise against
> using it for any other reason.
>
> B.
>
> On 5 July 2013 09:17, Jason Smith wrote:
> > Hi, Paul. I wrote up some thoughts on purging here:
> > https://github.com/iriscouch/cqs#purging-couchdb
> >
> > Note, that procedure is untested. It works as a thought experiment only.
> >
> > The procedure looks complicated, but all you will need is the core purge,
> > view, purge, view, etc. cadence as described in Damien's email I linked to.
> > As long as you never purge twice before hitting the view, you are fine.
> > Again, to my knowledge, the purge code is less well tested than other parts
> > of CouchDB, so perhaps copy your .couch file and try with that until you
> > are confident.
> >
> > On Fri, Jul 5, 2013 at 2:37 PM, Paul Hirst wrote:
> >
> >> I would like to purge a few (~100 million) documents from my database.
> >> I've been going through deleting them all, and that'll be complete in the
> >> next few days but I would like to free up some extra space by purging them
> >> also.
> >>
> >> My concern is around a comment on the wiki page here
> >> http://wiki.apache.org/couchdb/Purge_Documents
> >>
> >> 'If you have purged more than one document between querying your views,
> >> you will find that they will rebuild from scratch.'
> >>
> >> Since I have already deleted the documents I know they aren't showing up
> >> in the view any longer. Is there any way I can avoid this view
> >> invalidation? (My views take about 10 days to build from scratch so I can't
> >> afford the hit).
> >>
> >> I have a replica of the database. I could do the purge on the replica,
> >> wait for the view to rebuild, switch over, purge on the original db, wait
> >> for the view, switch back, unless there are any obvious problems with this
> >> approach?
> >>
> >> Cheers,
> >> Paul
> >>
> >> ________________________________
> >>
> >> Sophos Limited, The Pentagon, Abingdon Science Park, Abingdon, OX14 3YP,
> >> United Kingdom.
> >> Company Reg No 2096520. VAT Reg No GB 991 2418 08.
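
For anyone who wants to try the "purge, view, purge, view" cadence on a copy of their .couch file, here is a minimal Python sketch of the request sequence. It only builds the list of HTTP requests and sends nothing. It assumes the CouchDB 1.x HTTP API, where POST /db/_purge takes a JSON object mapping document IDs to lists of revisions, and where a plain GET on a view brings the index up to date. The function name purge_view_plan, the database name, and the view path are hypothetical. Whether several documents in one _purge request count as a single purge is also an assumption, so batch_size defaults to 1 to follow the wiki's wording literally.

```python
import json
from typing import Dict, Iterator, List, Tuple

# (HTTP method, path, JSON body or "" when there is no body)
Request = Tuple[str, str, str]


def purge_view_plan(
    db: str,
    view_path: str,
    revs_by_id: Dict[str, List[str]],
    batch_size: int = 1,
) -> Iterator[Request]:
    """Yield requests that alternate one purge with one view query.

    The view is queried after every purge so that two purges never
    happen back to back, which is the condition described in the thread.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    doc_ids = sorted(revs_by_id)
    for start in range(0, len(doc_ids), batch_size):
        batch = {
            doc_id: revs_by_id[doc_id]
            for doc_id in doc_ids[start:start + batch_size]
        }
        # Purge request body: {"docid": ["rev", ...], ...}
        yield ("POST", f"/{db}/_purge", json.dumps(batch, sort_keys=True))
        # Query the view before the next purge; limit=1 keeps the response small.
        yield ("GET", f"/{db}/{view_path}?limit=1", "")


if __name__ == "__main__":
    # Hypothetical document IDs and revisions, for illustration only.
    doomed = {"doc-a": ["2-abc"], "doc-b": ["2-def"]}
    for method, path, body in purge_view_plan(
        "mydb", "_design/app/_view/by_type", doomed
    ):
        print(method, path, body)
```

Feeding these requests to an HTTP client one at a time, and waiting for each view query to return before sending the next purge, would give the cadence described above. Run it against a copy of the database first.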