couchdb-replication mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: Checkpointing on read only databases
Date Wed, 23 Apr 2014 17:27:48 GMT
No, that one just keeps the last N. For internal replication checkpoints in a
cluster, Paul and I worked out something a bit smarter and more subtle:

https://github.com/cloudant/mem3/blob/master/src/mem3_rpc.erl#L152-L201
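A rough sketch of the idea as described, keeping checkpoints dense near the
tip and exponentially sparser behind it (illustrative Python paraphrasing the
description above, not the linked Erlang itself):

    def thin_checkpoints(seqs, n=10):
        """seqs: checkpointed update sequences, newest first."""
        kept, gap = [], 1
        for seq in seqs:
            if not kept or kept[-1] - seq >= gap:
                kept.append(seq)
                gap *= 2  # double the allowed spacing as entries age
            if len(kept) == n:
                break
        return kept

For sequences 100 down to 1 this keeps [100, 98, 94, 86, 70, 38]: fine-grained
resume points near the head, cheap coverage of older history.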

Cheers, Adam

On Apr 23, 2014, at 11:58 AM, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:

> this function?
> https://github.com/cloudant/bigcouch/blob/master/apps/couch/src/couch_rep.erl#L687-L781
> 
> 
> On Wed, Apr 23, 2014 at 11:35 AM, Adam Kocoloski <adam.kocoloski@gmail.com> wrote:
> 
>> There's an algorithm in the BigCouch codebase for storing up to N
>> checkpoints with exponentially increasing granularity (in terms of sequence
>> values) between them. It strikes a nice balance between checkpoint document
>> size and ability to resume with minimal replay.
>> 
>> Adam
>> 
>>> On Apr 23, 2014, at 11:28 AM, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>> 
>>> with the rolling hash thingy, a checkpoint document could store more than
>>> one database hash, e.g. the last 5, but that's entirely up to whoever is
>>> storing the checkpoint.  This would cover the case where you stop the
>>> replication after one of the dbs has stored the checkpoint but before the
>>> other one has.
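A hypothetical shape for such a checkpoint doc (field names and hash values
invented for illustration; this is not an actual CouchDB/PouchDB format):

    checkpoint = {
        "last_seq": 42,
        "db_hashes": ["9f2a", "77c1", "d04e"],  # newest first, capped at e.g. 5
    }

    def can_resume(checkpoint, current_db_hash):
        # Resume only if the db's current rolling hash was seen when the
        # checkpoint was written; otherwise fall back to a full replay.
        return current_db_hash in checkpoint["db_hashes"]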
>>> 
>>> 
>>>> On Tue, Apr 15, 2014 at 9:21 PM, Dale Harvey <dale@arandomurl.com> wrote:
>>>> 
>>>> ah, yeh got it now, cheers
>>>> 
>>>> 
>>>>> On 16 April 2014 02:17, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>> 
>>>>> Your source database is up to seq 10, but the box it's on catches fire.
>>>>> You have a backup, but it's at seq 8; same UUID, but you'll miss the
>>>>> next 2 seqs.
>>>>>> On Apr 15, 2014 8:57 PM, "Dale Harvey" <dale@arandomurl.com> wrote:
>>>>>> 
>>>>>> Sorry, still don't understand the problem here.
>>>>>> 
>>>>>> The uuid is stored inside the database file, so you either have the
>>>>>> same data and the same uuid, or neither?
>>>>>> 
>>>>>> 
>>>>>> On 15 April 2014 19:54, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>>> 
>>>>>>> I think the problem is not so much deleting and recreating a database
>>>>>>> as wiping a virtual machine and restoring from a backup; now you have
>>>>>>> more or less gone back in time with the target database, and it has
>>>>>>> different stuff but the same uuid.
>>>>>>> 
>>>>>>> 
>>>>>>>> On Tue, Apr 15, 2014 at 2:32 PM, Dale Harvey <dale@arandomurl.com> wrote:
>>>>>>> 
>>>>>>>> I don't understand the problem with per-db uuids; the uuid isn't
>>>>>>>> multivalued, nor is it queried.
>>>>>>>> 
>>>>>>>>  A is read only, B is client, B starts replication from A
>>>>>>>>  B reads the db uuid from A / itself, generates a replication_id,
>>>>>>>>  stores it on B
>>>>>>>>  try to fetch the replication checkpoint; if successful we query
>>>>>>>>  changes from since?
>>>>>>>> 
>>>>>>>> In pouch we store the uuid along with the data, so file-based backups
>>>>>>>> aren't a problem; seems couchdb could / should do that too.
>>>>>>>> 
>>>>>>>> This also fixes the problem mentioned on the mailing list, and one I
>>>>>>>> have run into personally, where people forward db requests but not
>>>>>>>> server requests via a proxy.
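A minimal sketch of the flow outlined above (function and doc-id scheme are
my own illustration, not PouchDB's actual API): derive a deterministic
replication id from the two database uuids and use it to name the _local
checkpoint doc on B.

    import hashlib

    def replication_id(source_uuid, target_uuid):
        # Same uuids -> same id, so B finds its old checkpoint on restart;
        # a recreated database gets a fresh uuid and thus a fresh checkpoint.
        return "_local/" + hashlib.md5(
            (source_uuid + "&" + target_uuid).encode("utf-8")).hexdigest()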
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 15 April 2014 19:18, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> except there is no way to calculate that from outside the database,
>>>>>>>>> as changes only ever gives the most recent document version.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Sun, Apr 13, 2014 at 9:47 PM, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> oo didn't think of that, yeah uuids wouldn't hurt, though the more
>>>>>>>>>> I think about the rolling hashing on revs, the more I like that
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Sun, Apr 13, 2014 at 6:00 PM, Adam Kocoloski <adam.kocoloski@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Yes, but then sysadmins have to be very very careful about
>>>>>>>>>>> restoring from a file-based backup. We run the risk that {uuid,
>>>>>>>>>>> seq} could be multi-valued, which diminishes its value
>>>>>>>>>>> considerably.
>>>>>>>>>>> 
>>>>>>>>>>> I like the UUID in general -- we've added them to our internal
>>>>>>>>>>> shard files at Cloudant -- but on their own they're not a
>>>>>>>>>>> bulletproof solution for read-only incremental replications.
>>>>>>>>>>> 
>>>>>>>>>>> Adam
>>>>>>>>>>> 
>>>>>>>>>>>> On Apr 13, 2014, at 5:16 PM, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> I mean if you're going to add new features to couch you could
>>>>>>>>>>>> just have the db generate a random uuid on creation that would be
>>>>>>>>>>>> different if it was deleted and recreated
>>>>>>>>>>>>> On Apr 13, 2014 1:59 PM, "Adam Kocoloski" <adam.kocoloski@gmail.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Other thoughts:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> - We could enhance the authorization system to have a role that
>>>>>>>>>>>>> allows updates to _local docs but nothing else. It wouldn't make
>>>>>>>>>>>>> sense for completely untrusted peers, but it could give peace of
>>>>>>>>>>>>> mind to sysadmins trying to execute replications with the
>>>>>>>>>>>>> minimum level of access possible.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> - We could teach the sequence index to maintain a rolling hash
>>>>>>>>>>>>> of the {id, rev} pairs that comprise the database up to that
>>>>>>>>>>>>> sequence, record that in the replication checkpoint document,
>>>>>>>>>>>>> and check that it's unchanged on resume. It's a new API
>>>>>>>>>>>>> enhancement and it grows the amount of information stored with
>>>>>>>>>>>>> each sequence, but it completely closes off the probabilistic
>>>>>>>>>>>>> edge case associated with simply checking that the {id, rev}
>>>>>>>>>>>>> associated with the checkpointed sequence has not changed.
>>>>>>>>>>>>> Perhaps overkill for what is admittedly a pretty low-probability
>>>>>>>>>>>>> event.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Adam
>>>>>>>>>>>>> 
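A minimal sketch of that rolling-hash idea, assuming an XOR accumulator over
per-pair digests (my illustration, not CouchDB code). XOR is order-independent
and invertible, so the index can retract a doc's old {id, rev} pair and fold
in the new one at each update:

    import hashlib

    def pair_digest(doc_id, rev):
        h = hashlib.sha1((doc_id + "\x00" + rev).encode("utf-8")).digest()
        return int.from_bytes(h, "big")

    class RollingDbHash:
        def __init__(self):
            self.acc = 0

        def update(self, doc_id, old_rev, new_rev):
            if old_rev is not None:
                self.acc ^= pair_digest(doc_id, old_rev)  # retract old pair
            self.acc ^= pair_digest(doc_id, new_rev)      # fold in new pair

A checkpoint would then record {seq, hash}, and a resuming replicator would
compare the stored hash against the source's hash at that same sequence.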
>>>>>>>>>>>>> On Apr 13, 2014, at 1:50 PM, Adam Kocoloski <adam.kocoloski@gmail.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yeah, this is a subtle little thing. The main reason we
>>>>>>>>>>>>>> checkpoint on both source and target and compare is to cover
>>>>>>>>>>>>>> the case where the source database is deleted and recreated in
>>>>>>>>>>>>>> between replication attempts. If that were to happen and the
>>>>>>>>>>>>>> replicator just resumed blindly from the checkpoint sequence
>>>>>>>>>>>>>> stored on the target, then the replication could permanently
>>>>>>>>>>>>>> miss some documents written to the new source.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'd love to have a robust solution for incremental replication
>>>>>>>>>>>>>> of read-only databases. To first order, a UUID on the source
>>>>>>>>>>>>>> database that was fixed at create time could do the trick, but
>>>>>>>>>>>>>> we'll run into trouble with file-based backups and restores. If
>>>>>>>>>>>>>> a database file is restored to a point before the latest
>>>>>>>>>>>>>> replication checkpoint, we'd again be in a position of
>>>>>>>>>>>>>> potentially permanently missing updates.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Calvin's suggestion of storing e.g. {seq, id, rev} instead of
>>>>>>>>>>>>>> simply seq as the checkpoint information would dramatically
>>>>>>>>>>>>>> reduce the likelihood of that type of permanent skip in the
>>>>>>>>>>>>>> replication, but it's only a probabilistic answer.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Adam
>>>>>>>>>>>>>> 
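A sketch of that {seq, id, rev} check, under the assumption that the source
exposes a way to look up the change at a given sequence (get_change_at below
is hypothetical, standing in for a _changes lookup):

    def resume_seq(checkpoint, get_change_at):
        if checkpoint is None:
            return 0
        change = get_change_at(checkpoint["seq"])
        if change and (change["id"], change["rev"]) == \
                (checkpoint["id"], checkpoint["rev"]):
            return checkpoint["seq"]  # checkpoint still matches the source
        return 0  # source history changed underneath us; replay everything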
>>>>>>>>>>>>>>> On Apr 13, 2014, at 1:31 PM, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Though currently we have the opposite problem, right, if we
>>>>>>>>>>>>>>> delete the target db? (this is me brainstorming)
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Could we store last rev in addition to last seq?
>>>>>>>>>>>>>>>> On Apr 13, 2014 1:15 PM, "Dale Harvey" <dale@arandomurl.com> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> If the src database was to be wiped, when we restarted
>>>>>>>>>>>>>>>> replication nothing would happen until the source database
>>>>>>>>>>>>>>>> caught up to the previously written checkpoint:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> create A, write 5 documents
>>>>>>>>>>>>>>>> replicate 5 documents A -> B, write checkpoint 5 on B
>>>>>>>>>>>>>>>> destroy A
>>>>>>>>>>>>>>>> write 4 documents
>>>>>>>>>>>>>>>> replicate A -> B, pick up checkpoint from B and go to ?since=5
>>>>>>>>>>>>>>>> .. no documents written
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> https://github.com/pouchdb/pouchdb/blob/master/tests/test.replication.js#L771
>>>>>>>>>>>>>>>> is our test that covers it
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On 13 April 2014 18:02, Calvin Metcalf <calvin.metcalf@gmail.com> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> If we were to unilaterally switch to checkpointing on
>>>>>>>>>>>>>>>>> target, what would happen? Replications in progress would
>>>>>>>>>>>>>>>>> lose their place?
>>>>>>>>>>>>>>>>>> On Apr 13, 2014 11:21 AM, "Dale Harvey" <dale@arandomurl.com> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> So with checkpointing we write the checkpoint to both A and
>>>>>>>>>>>>>>>>>> B and verify they match before using the checkpoint.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> What happens if the src of the replication is read only?
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> As far as I can tell couch will just throw a
>>>>>>>>>>>>>>>>>> checkpoint_commit_error and carry on from the start. The
>>>>>>>>>>>>>>>>>> only improvement I can think of is the user specifies they
>>>>>>>>>>>>>>>>>> know the src is read only and to only use the target
>>>>>>>>>>>>>>>>>> checkpoint; we can 'possibly' make that happen
>>>>>>>>>>>>>>>>>> automatically if the src specifically fails the write due
>>>>>>>>>>>>>>>>>> to permissions.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> -Calvin W. Metcalf
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> -Calvin W. Metcalf
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> -Calvin W. Metcalf
>>> 
>>> 
>>> 
>>> --
>>> -Calvin W. Metcalf
>> 
> 
> 
> 
> -- 
> -Calvin W. Metcalf

