From: Steven Barlow
Date: Fri, 5 Jul 2013 21:52:19 +1000
Subject: Re: Purging documents and view invalidation
To: "user@couchdb.apache.org"

Sorry if this is a tangent, but I wanted to pick up on the "rarely used in
the wild" thread:

I personally intend to use purge, because I have temporary partial
(filtered) replications of a "master" database at remote sites. When the
data has been consumed by the remote site, I figured I could purge it (to
save space). Is this not a valid, or common use case for purging?

On 05/07/2013, at 7:21 PM, Jason Smith wrote:

> I slightly disagree with Bob, but he is right that all purge buys you (vs.
> filtered replication and then swapping DBs) is a little bit of uptime.
> Purge is not "untested" but it is rarely used in the wild, so the
> cost/benefit for your uptime is something between "risky" and "unknown."
>
> (For me, personally, I would purge.)
>
>
> On Fri, Jul 5, 2013 at 3:31 PM, Robert Newson wrote:
>
>> Paul,
>>
>> If you replicate this database to another database and use a filter
>> that blocks deleted documents, the target will not contain a trace of
>> your 100 million deletes (that is, you can build a new database
>> without cruft without messing with your existing database). During the
>> replication, you can query the view on the target to build it
>> incrementally, or wait till the end, query it once and wait for
>> completion. At the end, flip your app to look at the new database
>> instead.
>>
>> The _purge feature is really only for the case where you accidentally
>> write your root password down in a document id or something (since
>> compaction will sweep away old document contents). I advise against
>> using it for any other reason.
>>
>> B.
>>
>>
>> On 5 July 2013 09:17, Jason Smith wrote:
>>> Hi, Paul. I wrote up some thoughts on purging here:
>>> https://github.com/iriscouch/cqs#purging-couchdb
>>>
>>> Note, that procedure is untested. It works as a thought experiment only.
>>>
>>> The procedure looks complicated, but all you will need is the core purge,
>>> view, purge, view, etc. cadence as described in Damien's email I linked to.
>>> As long as you never purge twice before hitting the view, you are fine.
>>> Again, to my knowledge, the purge code is less well tested than other parts
>>> of CouchDB, so perhaps copy your .couch file and try with that until you
>>> are confident.
>>>
>>>
>>> On Fri, Jul 5, 2013 at 2:37 PM, Paul Hirst wrote:
>>>
>>>> I would like to purge a few (~100 million) documents from my database.
>>>> I've been going through deleting them all, and that'll be complete in the
>>>> next few days but I would like to free up some extra space by purging them
>>>> also.
>>>>
>>>> My concern is around a comment on the wiki page here
>>>> http://wiki.apache.org/couchdb/Purge_Documents
>>>>
>>>> 'If you have purged more than one document between querying your views,
>>>> you will find that they will rebuild from scratch.'
>>>>
>>>> Since I have already deleted the documents I know they aren't showing up
>>>> in the view any longer. Is there any way I can avoid this view
>>>> invalidation? (My views take about 10 days to build from scratch so I can't
>>>> afford the hit).
>>>>
>>>> I have a replica of the database.
>>>> I could do the purge on the replica,
>>>> wait for the view to rebuild, switch over, purge on the original db, wait
>>>> for the view, switch back, unless there are any obvious problems with this
>>>> approach?
>>>>
>>>> Cheers,
>>>> Paul
>>>>
>>>> ________________________________
>>>>
>>>> Sophos Limited, The Pentagon, Abingdon Science Park, Abingdon, OX14 3YP,
>>>> United Kingdom.
>>>> Company Reg No 2096520. VAT Reg No GB 991 2418 08.
>>
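
For reference, a minimal Python sketch of the "purge, view, purge, view"
cadence quoted above. It only builds the sequence of requests; nothing is
sent to a server. The helper name purge_view_steps, the database name, the
view path and the batch size are all hypothetical. The request shape (POST
to /{db}/_purge with a JSON object mapping document IDs to lists of
revisions) follows CouchDB's documented purge API. Whether a single request
that carries several documents counts as one purge for view-index purposes
is an assumption not confirmed in this thread, so the default here is one
document per purge.

```python
import json
from typing import Dict, Iterator, List, Optional, Tuple

Step = Tuple[str, str, Optional[str]]  # (HTTP method, path, JSON body or None)


def purge_view_steps(
    revs_by_id: Dict[str, List[str]],
    db: str,
    view_path: str,
    batch_size: int = 1,
) -> Iterator[Step]:
    """Yield requests that alternate one purge with one view read.

    Hypothetical helper: it plans the requests but does not send them.
    Reading the view after every purge is what keeps the index from
    seeing two purges in a row.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    items = sorted(revs_by_id.items())
    for start in range(0, len(items), batch_size):
        batch = dict(items[start:start + batch_size])
        # Purge body: {"doc_id": ["rev1", ...], ...}
        yield ("POST", f"/{db}/_purge", json.dumps(batch))
        # A normal (non-stale) view read makes the index catch up;
        # limit=0 avoids transferring any rows.
        yield ("GET", f"/{db}/{view_path}?limit=0", None)


if __name__ == "__main__":
    # Made-up document IDs and revisions, for illustration only.
    doomed = {"doc-a": ["1-abc"], "doc-b": ["2-def"]}
    for method, path, body in purge_view_steps(
        doomed, "mydb", "_design/app/_view/by_date"
    ):
        print(method, path, body or "")
```

As suggested earlier in the thread, anything like this is best tried against
a copy of the .couch file first.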