couchdb-dev mailing list archives

From Patrick Antivackis <patrick.antivac...@gmail.com>
Subject Re: proposed replication rev history changes
Date Sun, 08 Feb 2009 18:38:55 GMT
Paul,
We agree about the current revision system and Damien's proposal.
But why wouldn't keeping a reference to the first revision in the history
help to determine whether we have a real conflict or a 'conflict due to
missing history'?
I think the possible cases are:

- Different documents with the same ID created on different nodes:
Current revision system: OK, can tell whether it is a real conflict based on
the full revision history of both documents
Damien's proposal: KO or OK, depending on whether the 1st revision was
already trimmed
Keeping the first revision combined with Damien's proposal: OK, can tell based
on the 1st revision

- Same document: the ID and the first revision are equal.

Here we have different possibilities in the document's life.
Assume a trim is done that keeps just 3 values in the history.
1)
NodeA :
_id : myDoc _rev : 4-marge
actual-rev-history [1-bart, 2-lisa, 3-homer, 4-marge]
damien-rev-history [2-lisa, 3-homer, 4-marge]
patrick-damien-rev-history [1-bart, 3-homer, 4-marge]
NodeB :
_id : myDoc _rev : 4-moe
actual-rev-history [1-bart, 2-lisa, 3-homer, 4-moe]
damien-rev-history [2-lisa, 3-homer, 4-moe]
patrick-damien-rev-history [1-bart, 3-homer, 4-moe]

Current revision system: OK, can find the edit conflict based on the full
revision history of both documents
Damien's proposal: OK, can find the edit conflict by comparing [2-lisa, 3-homer,
4-marge] with [2-lisa, 3-homer, 4-moe]
Keeping the first revision combined with Damien's proposal: OK, can find the
edit conflict by comparing [1-bart, 3-homer, 4-marge] with [1-bart, 3-homer, 4-moe]

2)
NodeA :
_id : myDoc _rev : 4-marge
actual-rev-history [1-bart, 2-lisa, 3-homer, 4-marge]
damien-rev-history [2-lisa, 3-homer, 4-marge]
patrick-damien-rev-history [1-bart, 3-homer, 4-marge]
NodeB :
_id : myDoc _rev : 4-selma
actual-rev-history [1-bart, 2-maggie, 3-patty, 4-selma]
damien-rev-history [2-maggie, 3-patty, 4-selma]
patrick-damien-rev-history [1-bart, 3-patty, 4-selma]

Current revision system: OK, can find that there is an edit conflict
Damien's proposal: KO, comparing [2-lisa, 3-homer, 4-marge] with [2-maggie,
3-patty, 4-selma] makes it look as if this is not the same document
Keeping the first revision combined with Damien's proposal: OK, can find that
there is an edit conflict by comparing [1-bart, 3-homer, 4-marge] with
[1-bart, 3-patty, 4-selma]

3)
NodeA :
_id : myDoc _rev : 7-moe
actual-rev-history [1-bart, 2-lisa, 3-homer, 4-marge, 5-patty, 6-selma, 7-moe]
damien-rev-history [5-patty, 6-selma, 7-moe]
patrick-damien-rev-history [1-bart, 6-selma, 7-moe]
NodeB :
_id : myDoc _rev : 4-marge
actual-rev-history [1-bart, 2-lisa, 3-homer, 4-marge]
damien-rev-history [2-lisa, 3-homer, 4-marge]
patrick-damien-rev-history [1-bart, 3-homer, 4-marge]

Current revision system: OK, can find that there is no conflict; NodeA is
simply more up to date than NodeB
Damien's proposal: KO, comparing [5-patty, 6-selma, 7-moe] with [2-lisa,
3-homer, 4-marge]
Keeping the first revision combined with Damien's proposal: we compare [1-bart,
6-selma, 7-moe] with [1-bart, 3-homer, 4-marge]. It is certainly the same
document, but we have revs 6 and 7 on NodeA and revs 3 and 4 on NodeB. Using
the replication record on both source and target, we can find that revisions
3 and 4 were already replicated, so revs 6 and 7 coming from NodeA must be
updates of the document. So OK, we can find that there is no conflict; NodeA
is simply more up to date than NodeB
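
To make the three cases above easier to check, here is a minimal Python
sketch. It is only an illustration of the comparison described in this
message, not CouchDB code: the function names (trim_damien, trim_keep_first,
has_common_revision) are made up for the example, histories are modelled as
plain lists of revision strings, and "comparing" is assumed to mean nothing
more than looking for a revision present in both trimmed histories.

```python
def trim_damien(history, keep=3):
    """Trimming as described for Damien's proposal: keep the newest `keep` revisions."""
    return list(history[-keep:])


def trim_keep_first(history, keep=3):
    """Variant discussed here: always keep the first revision, then the newest `keep - 1`."""
    if keep < 2:
        raise ValueError("keep must be at least 2 to hold the first and the latest revision")
    if len(history) <= keep:
        return list(history)
    return [history[0]] + list(history[-(keep - 1):])


def has_common_revision(hist_a, hist_b):
    """True if the two (possibly trimmed) histories share at least one revision."""
    return bool(set(hist_a) & set(hist_b))


# Full histories of NodeA and NodeB for cases 1) to 3) above.
cases = {
    1: (["1-bart", "2-lisa", "3-homer", "4-marge"],
        ["1-bart", "2-lisa", "3-homer", "4-moe"]),
    2: (["1-bart", "2-lisa", "3-homer", "4-marge"],
        ["1-bart", "2-maggie", "3-patty", "4-selma"]),
    3: (["1-bart", "2-lisa", "3-homer", "4-marge", "5-patty", "6-selma", "7-moe"],
        ["1-bart", "2-lisa", "3-homer", "4-marge"]),
}

results = {}
for number, (node_a, node_b) in cases.items():
    results[number] = {
        # Can the two nodes still find a shared revision after trimming?
        "damien": has_common_revision(trim_damien(node_a), trim_damien(node_b)),
        "keep_first": has_common_revision(trim_keep_first(node_a),
                                          trim_keep_first(node_b)),
    }
    print(number, results[number])
```

A shared revision only shows that both sides come from the same origin.
Deciding between an edit conflict (cases 1 and 2) and a node that is simply
ahead (case 3) still needs the extra information discussed above, such as the
replication record.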


So compared to Damien's proposal alone, keeping the first revision seems like
a good idea to me.

Am I missing something, or am I wrong?



2009/2/8 Paul Davis <paul.joseph.davis@gmail.com>

> Patrick,
>
> The issue at hand is precisely that the current revisioning system
> makes this use case deterministically identifiable as a conflict. The
> proposed change means that we introduce the possibility that we are
> unable to determine if we have a real conflict or a 'conflict due to
> missing history'.
>
> It's possible I missed something that special casing the initial
> revision would solve, but as I read the proposal it doesn't really fix
> the underlying problem of possibly spurious conflicts while
> introducing more complexity into the code.
>
> HTH,
> Paul Davis
>
> On Sun, Feb 8, 2009 at 12:14 PM, Patrick Antivackis
> <patrick.antivackis@gmail.com> wrote:
> > And how does today's revision system help in such a case?
> >
> >
> > 2009/2/8 Paul Davis <paul.joseph.davis@gmail.com>
> >
> >> On Sun, Feb 8, 2009 at 11:50 AM, Patrick Antivackis
> >> <patrick.antivackis@gmail.com> wrote:
> >> > I'm not sure I understood what you asked.
> >> >
> >> > It would be a document conflict that would need either manual correction
> >> > or, why not, an automatic correction that moves one of the documents,
> >> > but at least couch can tell for sure that it was not the same document at
> >> > the origin.
> >> >
> >> > What I don't understand is what today's revision system or the proposed
> >> > revision system brings for this kind of conflict, where two different
> >> > documents are created with the same ID on two different nodes. Except that
> >> > with the new revision proposal, you don't know for sure whether it was the
> >> > same or a different document at the origin if replication occurs after you
> >> > trimmed the reference to the first revision.
> >> >
> >>
> >> I'm saying that your suggestion to always retain the first revision is
> >> going to run into problems when a document is created on two machines
> >> and thus has two initial revisions. Or rather, it will run into the
> >> same problems as Damien's proposal yet have the added complexity that
> >> we now have the special cased 'first revisions' info.
> >>
> >> Unless of course I'm missing something else in the details.
> >>
> >> >
> >> >
> >> >
> >> > 2009/2/8 Paul Davis <paul.joseph.davis@gmail.com>
> >> >
> >> >> On Sun, Feb 8, 2009 at 6:07 AM, Patrick Antivackis
> >> >> <patrick.antivackis@gmail.com> wrote:
> >> >> > 2009/2/8 Damien Katz <damien@apache.org>
> >> >> >
> >> >> >> You got everything right except this. It doesn't solve the problem, because
> >> >> >> on another node, I could have a document that looked like ["1-foo" "2-bif"].
> >> >> >> That is a real edit conflict that wouldn't be caught by what I think you are
> >> >> >> proposing.
> >> >> >>
> >> >> >
> >> >> > That's right, there is a real edit conflict, but at least couchdb can
> >> >> > detect it based on the first revision reference that is always kept.
> >> >> > If you don't keep the reference to the first revision, you can end up with:
> >> >> > BaseA : ["1-foo"]
> >> >> > BaseB : empty
> >> >> > Replication :
> >> >> > BaseA : ["1-foo"]
> >> >> > BaseB : ["1-foo"]
> >> >> > Life goes on :
> >> >> > BaseA : ["1-foo" "2-bar" "3-baz" "4-biz"] but as it's trimmed to 3 you
> >> >> > only keep ["2-bar" "3-baz" "4-biz"]
> >> >> > BaseB : ["1-foo" "2-bad" "3-baf" "4-bif"] but as it's trimmed to 3 you
> >> >> > only keep ["2-bad" "3-baf" "4-bif"]
> >> >> > New replication :
> >> >> > ????? Same ID but no common revision, so what do we do? And couch cannot
> >> >> > even help to say whether it was the same doc at the origin.
> >> >> >
> >> >>
> >> >> Patrick,
> >> >>
> >> >> I'm pretty sure I see where you're coming from, but can you explain
> >> >> what would happen if the same document ID were created on two servers?
> >> >> Each server would have a different 'first rev', so whose first rev
> >> >> would be carried on in the future?
> >> >>
> >> >> >> This is used during conflict detection to figure out if 2 tree fragments
> >> >> >> overlap. We don't actually store a sequential number for each revision, we
> >> >> >> store a revision tree of numbers, with the root of the tree being the offset
> >> >> >> from 0 where it was trimmed (technically it's stemmed).
> >> >> >>
> >> >> >
> >> >> > You are right, keeping track of the revision number is indeed important,
> >> >> > especially when a document of the same origin is updated on different
> >> >> > nodes. But couldn't it be replaced by a timestamp? That is sequential too
> >> >> > and gives even more information.
> >> >> >
> >> >> >
> >> >> >> Sometimes people can deal with spurious conflicts. This gives you the
> >> >> >> option. If you don't want spurious conflicts, don't use this feature.
> >> >> >>
> >> >> >> And if you want the same document to be edited over and over, 100s of
> >> >> >> thousands of times, this is really the only option. The revision history
> >> >> >> will get too big and slow down updates tremendously.
> >> >> >>
> >> >> > Sure, but I would say these are different use cases. Records management,
> >> >> > for example, needs to keep track of changes over a period of time. And in
> >> >> > all the CMS/ECM systems I have worked on, cleanup of versions is based on
> >> >> > time more than on the number of revisions that have occurred.
> >> >> >
> >> >>
> >> >> HTH,
> >> >> Paul Davis
> >> >>
> >> >
> >>
> >
>
