On 19/08/2008, at 1:31 PM, Chris Anderson wrote:

> From what I understand about the replication process, the client won't
> have any trouble receiving a subset of the full replication.

Perhaps it would be a good idea for Someone Who Knows to write out a brief summary of exactly what goes on with document lifecycle and replication. I also have an (incomplete, assumed, probably wrong) understanding of how I *think* it works - I've read the technical overview but there's still a lot of black box stuff going on.

For example, where is CouchDB storing the list of "documents updated since last replication"? Is that list generated by a push replication as well? What if you push-replicate to more than one remote server, is another list created for each server? How are servers identified - do they have IDs? etc. etc.

Seems a few people would like to know more about the internals of replication. http://wiki.apache.org/couchdb/ConfiguringDistributedSystems whetted my appetite but it stops just as it's getting good!

> However,
> it's worth looking out for what happens if you try to replicate again
> later (perhaps from a different view of the same database). Could
> _revs get out of sync?

I am also curious about that - does replication grab only the latest _rev, or all of them (if extant)? If the status is deleted, does it bother replicating it? What happens if the local DB is compacted and deleted docs expunged before replication - are they deleted or retained on the remote?

If someone knows of an existing document with the answers to these questions, please direct me to it, and forgive my delinquency.

> This feature has been on the roadmap for a while, so maybe Damien has
> some ideas for how it should be designed.

Well, in the Technical Overview at http://incubator.apache.org/couchdb/docs/overview.html it says:

"Partial replicas can be created and maintained. Replication can be filtered by a javascript function, so that only particular documents or those meeting specific criteria are replicated. This can allow users to take subsets of a large shared database application offline for their own use, while maintaining normal interaction with the application and that subset of data."

I think that's why I thought it actually DID work, though I later realised that document is more of a "features CouchDB will have in the future" list than a description of current functionality. Regardless, it has obviously been intended to work like that for some time, so I imagine a lot of that design work has already been done and it's just a matter of finding the time to finish it :)

> --
> Chris Anderson
> http://jchris.mfdz.com