couchdb-user mailing list archives

From Dale Harvey <d...@arandomurl.com>
Subject Re: Which revisions does _revs_limit prune?
Date Sat, 31 Aug 2013 18:13:57 GMT
So the way I implemented this in PouchDB, given Paul Davis's advice, is to
stem the tree by turning the revision tree into a set of revision lists,
listing all the individual paths from head to root, then stemming each list
to the rev limit, then merging them back into a single tree.
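
The approach described above can be sketched roughly as follows. This is an
illustrative Python model only, not the PouchDB or CouchDB code: the nested-dict
tree shape and the names `root_to_leaf_paths` and `stem` are assumptions made
for the example, and revision ids are assumed to be unique within a tree.

```python
from typing import Dict, Iterator, Tuple

# Assumed toy representation: rev id -> dict of child revisions.
Tree = Dict[str, "Tree"]


def root_to_leaf_paths(tree: Tree, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield every path from a root revision down to a leaf (head) revision."""
    for rev, children in tree.items():
        path = prefix + (rev,)
        if children:
            yield from root_to_leaf_paths(children, path)
        else:
            yield path


def stem(tree: Tree, limit: int) -> Tree:
    """Keep the newest `limit` revisions of each path, then merge the paths."""
    parent = {}   # child rev id -> parent rev id, taken from the original tree
    keep = set()  # rev ids that survive on at least one stemmed path
    for path in root_to_leaf_paths(tree):
        for p, c in zip(path, path[1:]):
            parent[c] = p
        keep.update(path[-limit:])

    # Merge: re-attach each surviving revision to its parent if the parent
    # also survived; otherwise it becomes a root of the stemmed tree.
    nodes = {rev: {} for rev in keep}
    roots: Tree = {}
    for rev in keep:
        p = parent.get(rev)
        if p in keep:
            nodes[p][rev] = nodes[rev]
        else:
            roots[rev] = nodes[rev]
    return roots


tree = {"1-a": {"2-b": {"3-c": {"4-d": {}, "4-e": {"5-f": {}}}}, "2-x": {}}}
stemmed = stem(tree, 2)
```

In this model a branch that is already shorter than the limit (here `1-a` to
`2-x`) passes through stemming untouched, because each path is truncated
independently of the others.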

I worked off a refactor branch of couch that never got merged, but from a
glance it looks like this is how it is done in
https://github.com/apache/couchdb/blob/master/src/couchdb/couch_key_tree.erl

I keep meaning to write a test to reproduce this, but I am fairly certain
this has a problem: a document that generates a lot of conflicts (by being
deleted and recreated continuously) can DoS CouchDB, as shallow branches
never get pruned. I may be missing something, though.



On 31 August 2013 19:05, Robert Newson <rnewson@apache.org> wrote:

> The best I can find right now is from couch_key_tree where the
> truncation occurs;
>
> %% What makes this a bit more complicated is that there is a limit to the
> %% number of revisions kept, specified in couch_db.hrl (default is 1000).
> %% When this limit is exceeded only the last 1000 are kept. This comes in
> %% to play when branches are merged. The comparison has to begin at the
> %% same place in the branches. A revision id is of the form N-XXXXXXX
> %% where N is the current revision. So each path will have a start number,
> %% calculated in couch_doc:to_path using the formula
> %% N - length(RevIds) + 1 So, .eg. if a doc was edit 1003 times this start
> %% number would be 4, indicating that 3 revisions were truncated.
> %%
> %% This comes into play in @see merge_at/3 which recursively walks down one
> %% tree or the other until they begin at the same revision.
>
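
As a worked example of the start-number formula in the comment quoted above,
here is a minimal Python sketch. The function name `path_start` is
hypothetical; only the formula N - length(RevIds) + 1 comes from the quoted
comment.

```python
from typing import Sequence


def path_start(n: int, rev_ids: Sequence[str]) -> int:
    """Start number of a path: N - length(RevIds) + 1."""
    return n - len(rev_ids) + 1


# A document edited 1003 times with only the last 1000 revision ids kept.
kept = ["rev"] * 1000
start = path_start(1003, kept)  # 1003 - 1000 + 1 = 4
truncated = start - 1           # 3 revisions were dropped from the front
```

With no truncation the formula gives 1: a document at revision 3 with all
three ids kept has start 3 - 3 + 1 = 1.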
>
> On 31 August 2013 19:02, Jens Alfke <jens@couchbase.com> wrote:
> > The only description I can find about revs_limit is "the maximum number
> > of document revisions that will be tracked by CouchDB, even after
> > compaction has occurred." Nothing I've been able to find online says which
> > revisions are thrown out to reach this limit — it could be the oldest ones,
> > or the ones most deeply buried, for example.
> >
> > I’m guessing it’s most likely the oldest [earliest added] revisions, but
> > it’s not always clear what those are. For example, if a document with a big
> > rev tree gets replicated into this database, all of its revisions are the
> > same age as far as the local db is concerned, because they all got added in
> > the same PUT operation.
> >
> > Anyone know for sure?
> >
> > —Jens
>
