couchdb-dev mailing list archives

From "Randall Leeds (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-704) Replication can lose checkpoints
Date Sat, 16 Oct 2010 18:12:22 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921723#action_12921723 ]

Randall Leeds commented on COUCHDB-704:
---------------------------------------

Filipe,

It's true. This is an edge case, but I have had it happen in production with a database that
had slowed to *very* slow writes while running pull replication. The checkpoint code updated
the source first, and the local document was written, but the response was so slow that it was
treated as a timeout. When the replicator retried the save it got a conflict. Replication
crashed and the checkpoint was never written to the target.

I can imagine other, rare instances where this could occur. It's an edge case, but a potentially
nasty one.

> Replication can lose checkpoints
> --------------------------------
>
>                 Key: COUCHDB-704
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-704
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.11.2, 1.0.1
>            Reporter: Randall Leeds
>            Priority: Minor
>         Attachments: keep_session_id.patch, save-all-rep-checkpoints.patch, whitespace.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> When saving replication checkpoints in the _local/<repid> document the new entry
> is always pushed onto the _original_ "history" list property that existed at the start of
> the replication. When any number of things causes the checkpoint to be written to only one
> of the databases the head of the history list gets out of sync. Subsequent attempts to start
> this replication must start from the latest common replication log entry in the _original_
> history, as though this replication never occurred.
> A better idea is to push every checkpoint onto the history instead of replacing the head
> on each save.
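
The difference between the two strategies in the issue description above can be sketched
with a toy model. This is not CouchDB's replicator code: the `Checkpoint` type, the
`session_id` and `seq` fields, the `simulate` helper, and the sample values are all
hypothetical, and each checkpoint is assumed to carry its own identifier. The second
checkpoint is assumed to reach the source but not the target.

```python
from typing import List, NamedTuple, Optional


class Checkpoint(NamedTuple):
    # Hypothetical stand-in for one entry of the "history" list.
    session_id: str
    seq: int


def latest_common_seq(source: List[Checkpoint],
                      target: List[Checkpoint]) -> Optional[int]:
    """Return the seq of the newest source entry that the target also has."""
    target_ids = {c.session_id for c in target}
    for c in source:  # histories are kept newest-first
        if c.session_id in target_ids:
            return c.seq
    return None


def simulate(push_every_checkpoint: bool) -> Optional[int]:
    """Save two checkpoints; the second write to the target is lost."""
    original = [Checkpoint("a", 10)]  # history at the start of replication
    source = list(original)
    target = list(original)
    for i, cp in enumerate([Checkpoint("b", 20), Checkpoint("c", 30)]):
        # Either build on the current history (push every checkpoint)
        # or always build on the original history (replace the head).
        base_source = source if push_every_checkpoint else original
        base_target = target if push_every_checkpoint else original
        source = [cp] + base_source
        target_write_ok = (i == 0)  # assumption: only the first save reaches the target
        if target_write_ok:
            target = [cp] + base_target
    return latest_common_seq(source, target)


replace_head_seq = simulate(push_every_checkpoint=False)
push_every_seq = simulate(push_every_checkpoint=True)
print(replace_head_seq, push_every_seq)
```

Under these assumptions the replace-the-head variant can only agree on the original entry
and restarts from sequence 10, while the push-every-checkpoint variant still shares the
first checkpoint and restarts from sequence 20.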

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

