couchdb-user mailing list archives

From Jason Smith <jason.h.sm...@gmail.com>
Subject Re: Purging documents and view invalidation
Date Fri, 05 Jul 2013 15:23:12 GMT
If you do that, and you re-run replication (or potentially if you use
continuous replication) then those documents will be re-replicated back to
the remote site. Purging is as if the document was never created at all. So
when replication runs, the couches will want to copy it from the "master"
source.
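
For reference, a purge at the remote site is a single POST to the
database's _purge endpoint, with a JSON object mapping each document ID to
the revisions to purge. Below is a minimal Python sketch, not a tested
procedure. The database URL, document ID, and revision are made-up
placeholders, and the helper name build_purge_request is only for
illustration. It builds the request without sending it.

```python
import json
import urllib.request


def build_purge_request(db_url, revs_by_id):
    """Build (but do not send) a POST to <db_url>/_purge.

    revs_by_id maps each document ID to the list of revisions to purge,
    which is the body shape the _purge endpoint expects.
    """
    body = json.dumps({doc_id: list(revs) for doc_id, revs in revs_by_id.items()})
    return urllib.request.Request(
        db_url.rstrip("/") + "/_purge",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical remote database, document ID, and revision.
req = build_purge_request(
    "http://localhost:5984/remote_copy",
    {"task-0001": ["1-abc"]},
)
# urllib.request.urlopen(req) would actually send it.
```

Nothing in that request leaves a tombstone behind, which is why a later
replication from the master sees the document as missing and copies it
again.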


On Fri, Jul 5, 2013 at 8:12 PM, Steven Barlow <stemail23@gmail.com> wrote:

> Purged at the remote site. The master always contains the complete
> data set, the remote sites replicate partial data sets for their
> immediate needs, and then clean themselves up once the tasks are
> complete.
>
> On 05/07/2013, at 9:57 PM, Jason Smith <jason.h.smith@gmail.com> wrote:
>
> > On which database will you perform the purging?
> >
> >
> > On Fri, Jul 5, 2013 at 6:52 PM, Steven Barlow <stemail23@gmail.com> wrote:
> >
> >> Sorry if this is a tangent, but I wanted to pick up on the "rarely
> >> used in the wild" thread: I personally intend to use purge, because I
> >> have temporary partial (filtered) replications of a "master" database
> >> at remote sites. When the data has been consumed by the remote site, I
> >> figured I could purge it (to save space). Is this not a valid, or
> >> common use case for purging?
> >>
> >> On 05/07/2013, at 7:21 PM, Jason Smith <jason.h.smith@gmail.com> wrote:
> >>
> >>> I slightly disagree with Bob, but he is right that all purge buys you
> >>> (vs. filtered replication and then swapping DBs) is a little bit of uptime.
> >>> Purge is not "untested" but it is rarely used in the wild, so the
> >>> cost/benefit for your uptime is something between "risky" and "unknown."
> >>>
> >>> (For me, personally, I would purge.)
> >>>
> >>>
> >>> On Fri, Jul 5, 2013 at 3:31 PM, Robert Newson <rnewson@apache.org> wrote:
> >>>
> >>>> Paul,
> >>>>
> >>>> If you replicate this database to another database and use a filter
> >>>> that blocks deleted documents, the target will not contain a trace of
> >>>> your 100 million deletes (that is, you can build a new database
> >>>> without cruft without messing with your existing database). During the
> >>>> replication, you can query the view on the target to build it
> >>>> incrementally, or wait till the end, query it once and wait for
> >>>> completion. At the end, flip your app to look at the new database
> >>>> instead.
> >>>>
> >>>> The _purge feature is really only for the case where you accidentally
> >>>> write your root password down in a document id or something (since
> >>>> compaction will sweep away old document contents). I advise against
> >>>> using it for any other reason.
> >>>>
> >>>> B.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On 5 July 2013 09:17, Jason Smith <jhs@apache.org> wrote:
> >>>>> Hi, Paul. I wrote up some thoughts on purging here:
> >>>>> https://github.com/iriscouch/cqs#purging-couchdb
> >>>>>
> >>>>> Note, that procedure is untested. It works as a thought experiment only.
> >>>>>
> >>>>> The procedure looks complicated, but all you will need is the core purge,
> >>>>> view, purge, view, etc. cadence as described in Damien's email I linked to.
> >>>>> As long as you never purge twice before hitting the view, you are fine.
> >>>>> Again, to my knowledge, the purge code is less well tested than other parts
> >>>>> of CouchDB, so perhaps copy your .couch file and try with that until you
> >>>>> are confident.
> >>>>>
> >>>>>
> >>>>> On Fri, Jul 5, 2013 at 2:37 PM, Paul Hirst <paul.hirst@sophos.com> wrote:
> >>>>>
> >>>>>> I would like to purge a few (~100 million) documents from my database.
> >>>>>> I've been going through deleting them all, and that'll be complete in the
> >>>>>> next few days but I would like to free up some extra space by purging them
> >>>>>> also.
> >>>>>>
> >>>>>> My concern is around a comment on the wiki page here
> >>>>>> http://wiki.apache.org/couchdb/Purge_Documents
> >>>>>>
> >>>>>> 'If you have purged more than one document between querying your views,
> >>>>>> you will find that they will rebuild from scratch.'
> >>>>>>
> >>>>>> Since I have already deleted the documents I know they aren't showing up
> >>>>>> in the view any longer. Is there any way I can avoid this view
> >>>>>> invalidation? (My views take about 10 days to build from scratch so I can't
> >>>>>> afford the hit).
> >>>>>>
> >>>>>> I have a replica of the database. I could do the purge on the replica,
> >>>>>> wait for the view to rebuild, switch over, purge on the original db, wait
> >>>>>> for the view, switch back, unless there are any obvious problems with this
> >>>>>> approach?
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Paul
