couchdb-replication mailing list archives

From Adam Kocoloski <adam.kocolo...@gmail.com>
Subject Re: Checkpointing on read only databases
Date Tue, 15 Apr 2014 20:07:49 GMT
I think it'd make sense to only include the leaf revs.

It's annoying that an update to an old document blows away such a big chunk of the Merkle
tree (which makes recovery from a mismatch harder), but it does have the advantage over the
rolling hash of only requiring extra space in the inner btree nodes in couch_btree. Not sure
how easily one could layer a Merkle tree on top of other replicating systems (this is replication@,
after all).

Adam
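
A minimal sketch of the leaf-rev Merkle reduction discussed in this thread, assuming a plain binary tree built over by-sequence rows rather than a reduction stored in the inner nodes of couch_btree. The helper names (`leaf_hash`, `merkle_root`, `db_root`) and the sample doc ids and revs are hypothetical and only illustrate the idea:

```python
import hashlib
from typing import List, Tuple

Row = Tuple[int, str, List[str]]  # (update seq, doc id, leaf revs); assumed shape


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(seq: int, doc_id: str, leaf_revs: List[str]) -> bytes:
    # Only leaf revs are hashed; they are sorted so conflict order is irrelevant.
    payload = "\x00".join([str(seq), doc_id] + sorted(leaf_revs))
    return h(b"leaf:" + payload.encode("utf-8"))


def merkle_root(leaves: List[bytes]) -> bytes:
    if not leaves:
        return h(b"empty")
    level = leaves
    while len(level) > 1:
        # Hash adjacent pairs; a trailing unpaired node is hashed on its own.
        level = [h(b"node:" + b"".join(level[i:i + 2]))
                 for i in range(0, len(level), 2)]
    return level[0]


def db_root(by_seq: List[Row]) -> bytes:
    # Leaves are ordered by update sequence, as in a by-sequence index.
    ordered = sorted(by_seq, key=lambda row: row[0])
    return merkle_root([leaf_hash(*row) for row in ordered])


# Two databases that both report update sequence 3 but hold different content.
source = [(1, "a", ["1-aaa"]), (2, "b", ["1-bbb"]), (3, "c", ["1-ccc"])]
restored = [(1, "a", ["1-aaa"]), (2, "b", ["1-bbb"]), (3, "d", ["1-ddd"])]

same_seq = source[-1][0] == restored[-1][0]
same_content = db_root(source) == db_root(restored)
```

Because only leaf revs feed `leaf_hash`, dropping ancestor revisions through rev-stemming leaves the root unchanged, while two databases at the same sequence number with different content produce different roots.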

> On Apr 15, 2014, at 3:57 PM, Chris Anderson <jchris@couchbase.com> wrote:
> 
> I think compaction preserves old rev ids and sequences, but rev-stemming
> could result in a mismatch unless only the leaf revs are hashed in the
> merkle reduction.
> 
>> On Tuesday, April 15, 2014, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>> 
>> won't compaction make that tricky to calculate retroactively?
>> 
>> 
>> On Tue, Apr 15, 2014 at 3:10 PM, Chris Anderson <jchris@couchbase.com> wrote:
>> 
>>> If you want to know if checkpoints are the same, maybe a combination of the
>>> sequence number and a Merkle tree of document revision ids would work? It
>>> would require adding a reduction to the by-sequence tree, but you'd be able
>>> to know if two sequences also refer to the same content. E.g., is the source
>>> database the same as you talked to last, or just a new one with the same
>>> sequence number?
>>> 
>>> Chris
>>> 
>>> 
>>> On Tue, Apr 15, 2014 at 11:54 AM, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>> 
>>>> I think the problem is not so much deleting and recreating a database but
>>>> wiping a virtual machine and restoring from a backup. Now you have more or
>>>> less gone back in time with the target database, and it has different stuff
>>>> but the same uuid.
>>>> 
>>>> 
>>>>> On Tue, Apr 15, 2014 at 2:32 PM, Dale Harvey <dale@arandomurl.com> wrote:
>>>> 
>>>>> I don't understand the problem with per-db uuids, so the uuid isn't
>>>>> multivalued nor is it queried
>>>>> 
>>>>>   A is read-only, B is client, B starts replication from A
>>>>>   B reads the db uuid from A / itself, generates a replication_id, stores
>>>>>   on B
>>>>>   try to fetch replication checkpoint, if successful we query changes from
>>>>>   since?
>>>>> 
>>>>> In pouch we store the uuid along with the data, so file-based backups aren't
>>>>> a problem, seems couchdb could / should do that too
>>>>> 
>>>>> This also fixes the problem mentioned on the mailing list, and one I have
>>>>> run into personally where people forward db requests but not server
>>>>> requests via a proxy
>>>>> 
>>>>> 
>>>>> On 15 April 2014 19:18, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>> 
>>>>>> except there is no way to calculate that from outside the database as
>>>>>> changes only ever gives the more recent document version.
>>>>>> 
>>>>>> 
>>>>>> On Sun, Apr 13, 2014 at 9:47 PM, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>>> 
>>>>>>> oo didn't think of that, yeah uuids wouldn't hurt, though the more I
>>>>>>> think about the rolling hashing on revs, the more I like that
>>>>>>> 
>>>>>>> 
>>>>>>> On Sun, Apr 13, 2014 at 6:00 PM, Adam Kocoloski <adam.kocoloski@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Yes, but then sysadmins have to be very very careful about restoring from
>>>>>>>> a file-based backup. We run the risk that {uuid, seq} could be
>>>>>>>> multi-valued, which diminishes its value considerably.
>>>>>>>> 
>>>>>>>> I like the UUID in general -- we've added them to our internal shard
>>>>>>>> files at Cloudant -- but on their own they're not a bulletproof solution
>>>>>>>> for read-only incremental replications.
>>>>>>>> 
>>>>>>>> Adam
>>>>>>>> 
>>>>>>>>> On Apr 13, 2014, at 5:16 PM, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> I mean if you're going to add new features to couch you could just have
>>>>>>>>> the db generate a random uuid on creation that would be different if it
>>>>>>>>> was deleted and recreated
>> --
>> -Calvin W. Metcalf
> 
> 
> -- 
> —
> Chris Anderson  @jchris
> http://www.couchbase.com
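
The checkpoint flow from Dale's message, combined with the restore-from-backup case Calvin and Adam raise, can be sketched roughly as follows. This is a toy model under stated assumptions, not CouchDB's or PouchDB's implementation: `ToyDb`, `Checkpoint`, `replication_id`, and `resume_seq` are hypothetical names, and the database is assumed to maintain a running hash of its own writes, since (as noted in the thread) that value cannot be computed from the changes feed alone.

```python
import hashlib
from typing import Dict, NamedTuple


class Checkpoint(NamedTuple):
    uuid: str    # per-database uuid of the source
    seq: int     # last update sequence that was replicated
    digest: str  # running hash of the source's writes up to seq


class ToyDb:
    """Toy stand-in for a database that keeps a running hash of its writes."""

    def __init__(self, uuid: str) -> None:
        self.uuid = uuid
        self.seq = 0
        self.digest_at: Dict[int, str] = {0: ""}

    def write(self, doc_id: str, rev: str) -> None:
        prev = self.digest_at[self.seq]
        self.seq += 1
        # Each digest folds in the previous one, so it depends on all history.
        row = "\x00".join([prev, str(self.seq), doc_id, rev])
        self.digest_at[self.seq] = hashlib.sha256(row.encode("utf-8")).hexdigest()


def replication_id(source_uuid: str, target_uuid: str) -> str:
    pair = source_uuid + "\x00" + target_uuid
    return hashlib.sha256(pair.encode("utf-8")).hexdigest()


def resume_seq(source: ToyDb, target_uuid: str, stored: Dict[str, Checkpoint]) -> int:
    """Return the seq to use as `since`, or 0 to start from scratch."""
    cp = stored.get(replication_id(source.uuid, target_uuid))
    if cp is None or cp.uuid != source.uuid:
        return 0
    if source.digest_at.get(cp.seq) != cp.digest:
        return 0  # same {uuid, seq}, different history
    return cp.seq


# B replicates from read-only A up to seq 3 and stores the checkpoint on B.
a = ToyDb("uuid-A")
for doc_id, rev in [("x", "1-aaa"), ("y", "1-bbb"), ("z", "1-ccc")]:
    a.write(doc_id, rev)
on_b = {replication_id(a.uuid, "uuid-B"): Checkpoint(a.uuid, a.seq, a.digest_at[a.seq])}

# A is restored from a backup taken at seq 2 and then receives a different write.
restored = ToyDb("uuid-A")
restored.write("x", "1-aaa")
restored.write("y", "1-bbb")
restored.write("w", "1-ddd")
```

Here `resume_seq(a, "uuid-B", on_b)` resumes from the stored sequence, while `resume_seq(restored, "uuid-B", on_b)` falls back to 0 even though the restored database has the same uuid and sequence number, which is the multi-valued {uuid, seq} case.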
