couchdb-user mailing list archives

From Dan Santner <dansant...@me.com>
Subject Re: Anyway to alter doc when replicating?
Date Thu, 06 Feb 2014 19:18:27 GMT
I like that idea Riyad.  I'll give it a shot.  Thanks.
On Feb 6, 2014, at 1:04 PM, Riyad Kalla <rkalla@gmail.com> wrote:

> Dan, I wonder if you would be better served by creating a View in your
> original DB that does all the needed manipulation to the docs and coding up
> some form of manual replication where you take all the results from that
> view and copy them into your target data source?
> 
> You wouldn't be able to use the built-in CouchDB replication, but at least
> you would have total control over the data leaving your master source (it
> sounds like in your case masking PII/sensitive data before it leaves is
> important, so this step might be handy).
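
A minimal sketch of the manual copy Riyad describes, assuming the view emits the already-masked document as each row's value. The function names and URLs are hypothetical; the view response shape (`rows`, `value`) and the `_bulk_docs` endpoint are standard CouchDB HTTP API. `_rev` is dropped so the target assigns its own revisions, which also means a second run against an already populated target would hit conflicts unless the current target revisions are looked up first.

```python
import json
import urllib.request


def rows_to_bulk_payload(view_result):
    """Turn a CouchDB view response into a _bulk_docs request body."""
    docs = []
    for row in view_result["rows"]:
        # Assumption: the view's map function emits the masked doc as the value.
        doc = dict(row["value"])
        # Drop the source revision; the target creates its own revision history.
        doc.pop("_rev", None)
        docs.append(doc)
    return {"docs": docs}


def copy_view(view_url, target_db_url):
    """Fetch every row of the view and POST the docs to the target database.

    Both URLs are placeholders, e.g. a view URL on the source server and a
    database URL on the target server.
    """
    with urllib.request.urlopen(view_url) as resp:
        view_result = json.load(resp)
    body = json.dumps(rows_to_bulk_payload(view_result)).encode("utf-8")
    req = urllib.request.Request(
        target_db_url.rstrip("/") + "/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Only `rows_to_bulk_payload` is pure; `copy_view` needs two reachable CouchDB servers and is not called here.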
> 
> 
> On Thu, Feb 6, 2014 at 11:06 AM, Jens Alfke <jens@couchbase.com> wrote:
> 
>> 
>> On Feb 6, 2014, at 9:38 AM, Dan Santner <dansantner@me.com> wrote:
>> 
>>> I have the replication filtering down now, but I'm wondering: is there
>>> any way for me to change the doc before it copies to the target?
>> 
>> Well, to take your question literally, you can of course change the
>> documents on the original database before starting the replication. Only
>> the latest revisions (with the redacted names) will be transferred.
>> 
>> But I think you're asking for some kind of filter that would alter
>> documents while they're being replicated? I don't think that's feasible.
>> The document's revision ID is tied to its contents (it's based on a SHA-1
>> digest of the JSON) and you can't change the contents while leaving the
>> revision ID the same. But changing the rev ID in the middle of replication
>> would be really problematic because the replicator is transferring specific
>> revisions by their revIDs, and it would confuse it if it got a different
>> revID than the one it asked for.
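
To illustrate Jens's point, here is a toy sketch of a content-derived revision ID. It is not the server's actual algorithm (the exact hashing scheme differs between implementations); `toy_rev_id` is a hypothetical helper that only shows why editing a document in flight necessarily yields a different revID.

```python
import hashlib
import json


def toy_rev_id(generation, doc):
    """Illustrative only: '<generation>-<digest of the canonical JSON body>'."""
    # Sort keys so that the same content always serializes the same way.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return f"{generation}-{digest}"


original = {"name": "Alice Example", "total": 42}
redacted = {"name": "John Smith", "total": 42}

# Same generation, different content: the revision IDs cannot match.
print(toy_rev_id(1, original))
print(toy_rev_id(1, redacted))
```

A replicator that asked for the first revID and received a document hashing to the second would have no way to reconcile the two.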
>> 
>>> The use case is I have production documents that I want to migrate
>>> somewhere else but change all the names to 'John Smith' before they land in
>>> the new destination.  Also need to remove a couple other things that might
>>> be considered sensitive.
>> 
>> The only good option I can think of is to keep the sensitive parts of the
>> data in separate documents. (The main doc would have a property that
>> contains the doc ID of the sensitive data.) Then you can run a filtered
>> replication that sends the regular documents but not the sensitive ones.
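
The split Jens suggests could look roughly like this. The field names (`private_id`, `type`), the `:private` ID suffix, and the design document name are all assumptions for illustration; the filter itself is an ordinary CouchDB filter function (`function(doc, req)` returning true for documents that should replicate), stored here as a string inside a design document.

```python
def split_sensitive(doc, sensitive_fields):
    """Split one document into a replicable main doc and a private companion doc.

    Assumes `sensitive_fields` does not contain "_id".
    """
    main = {k: v for k, v in doc.items() if k not in sensitive_fields}
    private = {k: v for k, v in doc.items() if k in sensitive_fields}
    private["_id"] = doc["_id"] + ":private"  # hypothetical naming scheme
    private["type"] = "sensitive"             # marker the filter checks
    main["private_id"] = private["_id"]       # pointer from main doc to private doc
    return main, private


# Design document holding the replication filter (JavaScript, as CouchDB expects).
FILTER_DESIGN_DOC = {
    "_id": "_design/export",
    "filters": {
        "public_only": "function(doc, req) { return doc.type !== 'sensitive'; }"
    },
}
```

A replication started with `"filter": "export/public_only"` would then send the main documents and skip the companions.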
>> 
>> --Jens

