couchdb-dev mailing list archives

From "Damien Katz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions
Date Tue, 23 Aug 2011 20:56:28 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089755#comment-13089755 ]

Damien Katz commented on COUCHDB-1256:
--------------------------------------

I agree with the fix Adam proposes. The code in question is an optimization that avoids
sending and re-checking documents we've already examined, but it breaks with checkpointing.
Removing the code is the right fix for now.
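
As a rough illustration of why the optimization drops data, here is a small Python model
(not the Erlang code itself) of a by-seq fold that filters each document's revisions by seq,
using the documents from the report quoted below. The RevInfo/DocInfo classes, the filter_revs
flag, and the per-revision seq values (3 for the conflict rev of foo, 1 for its original rev)
are assumptions inferred from the curl transcript.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RevInfo:
    rev: str
    seq: int  # update sequence at which this leaf revision was written


@dataclass(frozen=True)
class DocInfo:
    id: str
    high_seq: int        # the document's single key in the by-seq index
    revs: List[RevInfo]  # all leaf revisions, winning revision first


def changes_since(by_seq: List[DocInfo], start_seq: int,
                  filter_revs: bool) -> List[Tuple[int, str, List[str]]]:
    """Model of a style=all_docs changes fold starting after start_seq.

    filter_revs=True mimics the optimization being removed: revisions whose
    own seq is not greater than start_seq are dropped from the row.
    """
    rows = []
    for info in sorted(by_seq, key=lambda d: d.high_seq):
        if info.high_seq <= start_seq:
            continue  # the fold starts at start_seq + 1
        revs = info.revs
        if filter_revs:
            revs = [r for r in revs if start_seq < r.seq]
        rows.append((info.high_seq, info.id, [r.rev for r in revs]))
    return rows


REV_A = "1-0dc33db52a43872b6f3371cef7de0277"
REV_B = "1-cc609831f0ca66e8cd3d4c1e0d98108a"

# State after the curl commands in the report (seq values per rev are assumed).
BY_SEQ = [
    DocInfo("bar", 2, [RevInfo(REV_B, 2)]),
    DocInfo("foo", 3, [RevInfo(REV_B, 3), RevInfo(REV_A, 1)]),
]

if __name__ == "__main__":
    print(changes_since(BY_SEQ, 2, filter_revs=True))   # old behavior
    print(changes_since(BY_SEQ, 2, filter_revs=False))  # with the filter removed
```

With the filter on, the since=2 row for foo carries only the conflict revision, which matches
the second _changes response in the report; with the filter off, both leaf revisions are sent.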

In the future, we can add the optimization back if the checkpointing can distinguish completed
replications from checkpointed ones. Checkpoint records would keep a "high water mark" for the
last completed replication, and both the seq num and that high water mark would be sent to the
_changes handler. The _changes handler would not send docs with a seq below the
checkpoint value. When the replication checkpoints, it saves the current seq and the last
completed high water mark. When the replication completes, it sets the last seq and the high
water mark to the same seq, and that gets sent for the next replication.
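
A minimal Python sketch of that bookkeeping, purely to make the idea concrete. Nothing here is
an existing CouchDB API: the Checkpoint class, its field names, and changes_since_hwm are
hypothetical, and the sketch assumes the "checkpoint value" used for filtering is the high
water mark, with revisions strictly below it being skipped.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One by-seq row: (doc_seq, doc_id, [(rev, rev_seq), ...]); a hypothetical shape.
Row = Tuple[int, str, List[Tuple[str, int]]]


@dataclass
class Checkpoint:
    last_seq: int = 0    # where the next _changes request resumes
    high_water: int = 0  # seq of the last *completed* replication

    def checkpoint(self, current_seq: int) -> None:
        # Mid-replication: advance the resume point, keep the old high water mark.
        self.last_seq = current_seq

    def complete(self, final_seq: int) -> None:
        # Replication finished: both values move to the same seq.
        self.last_seq = final_seq
        self.high_water = final_seq


def changes_since_hwm(by_seq: List[Row], since: int,
                      high_water: int) -> List[Tuple[int, str, List[str]]]:
    """Resume after `since`, but only drop revs below the high water mark."""
    rows = []
    for doc_seq, doc_id, revs in sorted(by_seq, key=lambda row: row[0]):
        if doc_seq <= since:
            continue
        kept = [rev for rev, rev_seq in revs if rev_seq >= high_water]
        rows.append((doc_seq, doc_id, kept))
    return rows


if __name__ == "__main__":
    db = [
        (2, "bar", [("1-cc60", 2)]),
        (3, "foo", [("1-cc60", 3), ("1-0dc3", 1)]),
    ]
    cp = Checkpoint()
    cp.checkpoint(2)  # checkpointed, not completed
    print(changes_since_hwm(db, cp.last_seq, cp.high_water))
    cp.complete(3)    # completed: last_seq == high_water == 3
    print(cp)
```

In this model a request resumed from a mere checkpoint still receives every leaf revision of
foo, because the high water mark has not moved; only after a completed replication are older
revisions filtered out.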

Also, continuous replication would need a way to signal when a replication is "complete",
so that the high water mark can be set there too.

> Incremental requests to _changes can skip revisions
> ---------------------------------------------------
>
>                 Key: COUCHDB-1256
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1256
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.10, 0.10.1, 0.10.2, 0.11.1, 0.11.2, 1.0, 1.0.1, 1.0.2, 1.1, 1.0.3
>         Environment: confirmed on Apache CouchDB 1.1.0, bug appears to be present in 1.0.3 and trunk
>            Reporter: Adam Kocoloski
>            Assignee: Adam Kocoloski
>            Priority: Blocker
>             Fix For: 1.0.4, 1.1.1, 1.2
>
>         Attachments: jira-1256-test.diff
>
>
> Requests to _changes with style=all_docs&since=N (requests made by the replicator)
> are liable to suppress revisions of a document.  The following sequence of curl commands
> demonstrates the bug:
> curl -X PUT localhost:5985/revseq 
> {"ok":true}
> curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/foo -d '{"a":123}'
> {"ok":true,"id":"foo","rev":"1-0dc33db52a43872b6f3371cef7de0277"}
> curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/bar -d '{"a":456}'
> {"ok":true,"id":"bar","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
> % stick a conflict revision in foo
> curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/foo?new_edits=false -d '{"_rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a", "a":123}'
> {"ok":true,"id":"foo","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
> % request without since= gives the expected result
> curl -Hcontent-type:application/json localhost:5985/revseq/_changes?style=all_docs
> {"results":[
> {"seq":2,"id":"bar","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]},
> {"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"},{"rev":"1-0dc33db52a43872b6f3371cef7de0277"}]}
> ],
> "last_seq":3}
> % request starting from since=2 suppresses revision 1-0dc33db52a43872b6f3371cef7de0277 of foo
> curl localhost:5985/revseq/_changes?style=all_docs\&since=2
> {"results":[
> {"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]}
> ],
> "last_seq":3}
> I believe the fix is something like this (though we could refactor further because Style
> is unused):
> diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
> index e8705be..65aeca3 100644
> --- a/src/couchdb/couch_db.erl
> +++ b/src/couchdb/couch_db.erl
> @@ -1029,19 +1029,7 @@ changes_since(Db, Style, StartSeq, Fun, Acc) ->
>      changes_since(Db, Style, StartSeq, Fun, [], Acc).
>      
>  changes_since(Db, Style, StartSeq, Fun, Options, Acc) ->
> -    Wrapper = fun(DocInfo, _Offset, Acc2) ->
> -            #doc_info{revs=Revs} = DocInfo,
> -            DocInfo2 =
> -            case Style of
> -            main_only ->
> -                DocInfo;
> -            all_docs ->
> -                % remove revs before the seq
> -                DocInfo#doc_info{revs=[RevInfo ||
> -                    #rev_info{seq=RevSeq}=RevInfo <- Revs, StartSeq < RevSeq]}
> -            end,
> -            Fun(DocInfo2, Acc2)
> -        end,
> +    Wrapper = fun(DocInfo, _Offset, Acc2) -> Fun(DocInfo, Acc2) end,
>      {ok, _LastReduction, AccOut} = couch_btree:fold(by_seq_btree(Db),
>          Wrapper, Acc, [{start_key, StartSeq + 1}] ++ Options),
>      {ok, AccOut}.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
