couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: proposed replication rev history changes
Date Sun, 08 Feb 2009 17:18:58 GMT
Patrick,

The issue at hand is precisely that the current revisioning system
makes this use case deterministically identifiable as a conflict. The
proposed change means that we introduce the possibility that we are
unable to determine if we have a real conflict or a 'conflict due to
missing history'.

It's possible I missed something that special-casing the initial
revision would solve, but as I read the proposal, it doesn't fix
the underlying problem of possibly spurious conflicts, and it
introduces more complexity into the code.
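
To make the ambiguity concrete, here is a minimal sketch in Python using
the BaseA/BaseB example quoted below. It is not CouchDB's implementation:
the flat-list history, the `stem` and `classify` helpers, and the rule that
a history is complete when its oldest entry is revision 1 are all
assumptions made for illustration.

```python
def stem(history, limit):
    """Keep only the newest `limit` revisions (hypothetical helper)."""
    return history[-limit:]


def is_complete(history):
    """Assume a history is untrimmed if its oldest entry is revision 1."""
    return history[0].split("-", 1)[0] == "1"


def classify(a, b):
    """Compare two revision lists (oldest first) for the same document ID."""
    if a[-1] in b or b[-1] in a:
        # One side's latest revision is already known to the other.
        return "no conflict"
    if set(a) & set(b):
        # Both sides share an ancestor but ended on different revisions.
        return "edit conflict"
    if is_complete(a) and is_complete(b):
        # Full histories with nothing in common: created independently.
        return "separate documents"
    # Trimmed histories with nothing in common: we cannot tell which case.
    return "indeterminate"


base_a = ["1-foo", "2-bar", "3-baz", "4-biz"]
base_b = ["1-foo", "2-bad", "3-baf", "4-bif"]

print(classify(base_a, base_b))                    # full histories
print(classify(stem(base_a, 3), stem(base_b, 3)))  # trimmed to 3
```

With full histories the shared "1-foo" identifies a real edit conflict.
Once both sides are trimmed to 3, nothing is shared and the sketch can
only report "indeterminate", which is the 'conflict due to missing
history' case.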

HTH,
Paul Davis

On Sun, Feb 8, 2009 at 12:14 PM, Patrick Antivackis
<patrick.antivackis@gmail.com> wrote:
> And how does today's revision system help in such a case?
>
>
> 2009/2/8 Paul Davis <paul.joseph.davis@gmail.com>
>
>> On Sun, Feb 8, 2009 at 11:50 AM, Patrick Antivackis
>> <patrick.antivackis@gmail.com> wrote:
>> > I'm not sure I understood what you asked.
>> >
>> > It would be a document conflict that would need either manual correction
>> > or, why not, an automatic correction applying a move to one of the
>> > documents, but at least couch can tell for sure it was not the same
>> > document at the origin.
>> >
>> > What I don't understand is what today's revision system or the proposed
>> > revision system brings for this kind of conflict, where two different
>> > documents are created with the same Id on two different nodes. Except that
>> > with the new revision proposal, you don't know for sure whether it was the
>> > same or a different document at the origin if replication occurs after you
>> > trimmed the reference to the first revision.
>> >
>>
>> I'm saying that your suggestion to always retain the first revision is
>> going to run into problems when a document is created on two machines
>> and thus has two initial revisions. Or rather, it will run into the
>> same problems as Damien's proposal yet have the added complexity that
>> we now have the special-cased 'first revisions' info.
>>
>> Unless of course I'm missing something else in the details.
>>
>> >
>> >
>> >
>> > 2009/2/8 Paul Davis <paul.joseph.davis@gmail.com>
>> >
>> >> On Sun, Feb 8, 2009 at 6:07 AM, Patrick Antivackis
>> >> <patrick.antivackis@gmail.com> wrote:
>> >> > 2009/2/8 Damien Katz <damien@apache.org>
>> >> >
>> >> >> You got everything right except this. It doesn't solve the problem,
>> >> >> because on another node, I could have a document that looked like
>> >> >> ["1-foo" "2-bif"]. That is a real edit conflict that wouldn't be
>> >> >> caught by what I think you are proposing.
>> >> >>
>> >> >
>> >> > That's right, there is a real edit conflict, but at least couchdb can
>> >> > detect it based on the first revision reference that is always kept.
>> >> > If you don't keep the reference to the first revision you can arrive at:
>> >> > BaseA : ["1-foo"]
>> >> > BaseB : empty
>> >> > Replication :
>> >> > BaseA : ["1-foo"]
>> >> > BaseB : ["1-foo"]
>> >> > Life goes on :
>> >> > BaseA : ["1-foo" "2-bar" "3-baz" "4-biz"] but as it's trimmed to 3 you
>> >> > only keep ["2-bar" "3-baz" "4-biz"]
>> >> > BaseB : ["1-foo" "2-bad" "3-baf" "4-bif"] but as it's trimmed to 3 you
>> >> > only keep ["2-bad" "3-baf" "4-bif"]
>> >> > New replication :
>> >> > ????? same Id but no common revision, what do we do? And couch cannot
>> >> > even help to say if it was the same doc or not at the origin.
>> >> >
>> >>
>> >> Patrick,
>> >>
>> >> I'm pretty sure I see where you're coming from, but can you explain
>> >> what would happen if the same document ID were created on two servers?
>> >> Each server would have a different 'first rev', so whose first rev
>> >> would be carried on in the future?
>> >>
>> >> >> This is used during conflict detection to figure out if 2 tree
>> >> >> fragments overlap. We don't actually store a sequential number for each
>> >> >> revision, we store a revision tree of numbers, with the root of the tree
>> >> >> being the offset from 0 where it was trimmed (technically it's stemmed).
>> >> >>
>> >> >
>> >> > You are right, keeping track of the revision number is indeed important,
>> >> > especially when a document of the same origin is updated on different
>> >> > nodes. But couldn't it be replaced by a timestamp? This is sequential too
>> >> > and gives even more information.
>> >> >
>> >> >
>> >> >> Sometimes people can deal with spurious conflicts. This gives you the
>> >> >> option. If you don't want spurious conflicts, don't use this feature.
>> >> >>
>> >> >> And if you want the same document to be edited over and over, 100s of
>> >> >> thousands of times, this is really the only option. The revision history
>> >> >> will get too big and slow down updates tremendously.
>> >> >>
>> >> > Sure, but I would say these are different use cases. Records management,
>> >> > for example, needs to keep track of changes over a period of time. And in
>> >> > all CMS/ECM I have worked on, cleanup of versions is done on a time basis
>> >> > more than on the number of revisions that have occurred.
>> >> >
>> >>
>> >> HTH,
>> >> Paul Davis
>> >>
>> >
>>
>
