couchdb-dev mailing list archives

From eric casteleijn <eric.castele...@canonical.com>
Subject Re: _changes resource
Date Tue, 07 Jul 2009 07:43:40 GMT
Adam Kocoloski wrote:
> On Jul 6, 2009, at 3:58 PM, Chris Anderson wrote:
> 
>>> == Deleted and Conflicts==
>>>
>>> _all_docs_by_seq includes a 'deleted' flag and a list of 'conflicts'.
>>> Should the _changes API do the same?
>>
>> The plan is to drive replication from changes, so anything needed by
>> replication is on the roadmap. I don't think it'd hurt to have any of
>> those but Damien would be better to answer this one.
> 
> The deleted=true flag probably won't be needed by the replicator, 
> because the _changes feed includes the deletion revid.  I expect that 
> the replicator will just download this revision like any other, find the 
> _deleted:true bit set in the document, and delete the document on the 
> target.

Note that replication is an important user of _changes, but it will by 
no means be the only one if update notifiers go away (which I think 
everyone agrees would be a good idea). I would like the option to see 
not only when a document was deleted, but also when it was first created 
on the node in question. In my application, creation requires special 
action over and above what needs to happen for an update to a document.
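
As a rough illustration of that distinction, here is a minimal Python 
sketch. The row shape (an `id` key and an optional `deleted` flag) and 
the `classify_change` helper are assumptions made for the example, not 
the actual _changes format. The consumer tracks the ids it has already 
seen, which is exactly the bookkeeping a 'created' marker in the feed 
would make unnecessary.

```python
def classify_change(row, seen_ids):
    """Label a change row as 'created', 'updated', or 'deleted'.

    Assumes a hypothetical row shape: {"id": ..., "deleted": True (optional)}.
    seen_ids is a set of document ids this consumer already knows about.
    """
    doc_id = row["id"]
    if row.get("deleted", False):
        # Forget the id so a later re-creation is reported as 'created'.
        seen_ids.discard(doc_id)
        return "deleted"
    if doc_id not in seen_ids:
        seen_ids.add(doc_id)
        return "created"
    return "updated"


seen = set()
rows = [
    {"id": "a"},
    {"id": "a"},
    {"id": "a", "deleted": True},
    {"id": "a"},
]
labels = [classify_change(row, seen) for row in rows]
```

Here `labels` comes out as created, updated, deleted, created for the 
four rows in order.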

> _conflicts and _deleted_conflicts are more interesting.  When one of 
> these occurs, the document shows up in the _changes feed, but the 
> revision in that row is the latest revision of the document, not the 
> conflict/deleted_conflict rev.  Unlike _all_docs_by_seq, it's not 
> possible for the replicator to determine the list of revisions to 
> replicate solely by analyzing the _changes feed.
> 
> I think the most efficient solution is to start including conflict and 
> deleted_conflict revisions in the revlist in the _changes row. I don't 
> know the revision tree well enough to know if it's possible to identify 
> the set of all conflict revisions that were saved after update_sequence 
> N, but if it is that would be a neat restriction.

This sounds like a good idea.

> Another option might be to configure a metadata-only request so that the 
> replicator could check what revisions exist on the source for each 
> updated document.  Could be a useful thing to have in general.

And so does this.

What I would like to add:

Right now the _changes feeds are per db. That is great for some use 
cases (like replication), but in others, where there are many thousands 
of databases, one global _changes feed would be much more practical. It 
is also how the current update notifiers work, so not having this option 
would break existing applications; at least it would break mine. ;)
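
To make the idea of a global feed concrete, here is a small Python 
sketch that interleaves several per-db feeds into one stream and tags 
each row with its database. The `global_changes` and `_tag` names, the 
`db` key, and the row shape are all hypothetical. Sequence numbers are 
per database, so the merged order implies nothing across databases.

```python
from itertools import zip_longest


def _tag(name, rows):
    """Yield each row with the (hypothetical) 'db' key added."""
    for row in rows:
        yield {"db": name, **row}


def global_changes(feeds):
    """Interleave per-db change rows round-robin into a single stream.

    feeds: dict mapping a database name to an iterable of change rows.
    """
    done = object()  # sentinel marking an exhausted feed
    tagged = [_tag(name, rows) for name, rows in feeds.items()]
    for batch in zip_longest(*tagged, fillvalue=done):
        for item in batch:
            if item is not done:
                yield item


feeds = {
    "db1": [{"seq": 1, "id": "a"}, {"seq": 2, "id": "b"}],
    "db2": [{"seq": 1, "id": "x"}],
}
merged = list(global_changes(feeds))
```

A consumer would then read `merged` (or the generator directly) instead 
of polling thousands of separate feeds.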

One last thing that would be great to have is a way to configure what 
information goes into a particular _changes feed, or an option to write 
your own _changes-like feeds in JavaScript, as you can a view. One could 
then have, for instance, a feed of changes to the values of a particular 
field. Processes that act on updates from the db would never have to 
query back into the db for additional data, which could be a performance 
win. (That is, provided the configurable _changes feeds don't cost too 
much performance themselves.)
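
A minimal sketch of what such a custom feed might compute, written in 
Python rather than JavaScript for brevity. The `field_changes` function, 
its input of (seq, doc) pairs, and the emitted row shape are assumptions 
for illustration only; nothing like this exists in CouchDB as far as 
this message is concerned.

```python
def field_changes(updates, field):
    """Yield a row each time the value of `field` changes for a document.

    updates: iterable of (seq, doc) pairs in update order, where doc is a
    dict with an "_id" key. A missing field is treated as the value None.
    """
    last_seen = {}
    for seq, doc in updates:
        doc_id = doc["_id"]
        value = doc.get(field)
        if doc_id in last_seen and last_seen[doc_id] == value:
            continue  # field unchanged, so emit nothing for this update
        last_seen[doc_id] = value
        # The row carries the new value, so the consumer need not fetch the doc.
        yield {"seq": seq, "id": doc_id, "field": field, "value": value}


updates = [
    (1, {"_id": "a", "status": "new", "n": 1}),
    (2, {"_id": "a", "status": "new", "n": 2}),
    (3, {"_id": "a", "status": "done", "n": 2}),
]
status_rows = list(field_changes(updates, "status"))
```

Only updates 1 and 3 produce rows here, because update 2 leaves 
`status` untouched.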

I've also filed a feature request in JIRA with these suggestions:

https://issues.apache.org/jira/browse/COUCHDB-390

but more discussion, here or on that issue, is most welcome.

-- 
- eric casteleijn
http://www.canonical.com
