couchdb-dev mailing list archives

From Randall Leeds <randall.le...@gmail.com>
Subject Re: Is it possible to bring back optional old all-or-nothing behaviour?
Date Fri, 23 Dec 2011 04:51:33 GMT
On Thu, Dec 22, 2011 at 20:46, Randall Leeds <randall.leeds@gmail.com> wrote:
> On Thu, Dec 22, 2011 at 20:18, Alexander Uvarov
> <alexander.uvarov@gmail.com> wrote:
>>
>> On Dec 23, 2011, at 1:49 AM, Paul Davis wrote:
>>
>>> On Thu, Dec 22, 2011 at 11:31 AM, Robert Newson <rnewson@apache.org> wrote:
>>>> In my opinion, and I believe the majority opinion of the group, the
>>>> CouchDB API should be the same everywhere. This specifically includes
>>>> not doing things on a single box that will not work in a
>>>> clustered/sharded situation. It's why our transactions are scoped to a
>>>> single document, for example.
>>>>
>>>> I will also note that all_or_nothing does not provide multi-document
>>>> ACID transactions. The batches used in bulk_docs are not recorded, so
>>>> those items will be replicated individually (and in parallel, so not
>>>> even in a predictable order), which would break the C and I
>>>> characteristics on the receiving server. The old semantic would abort
>>>> the whole update if any one of the documents couldn't be updated but
>>>> the new semantic simply introduces a conflict in that case.
>>>>
>>>
>>> Slight nit pick, but new behavior just returns the error that the
>>> update would *cause* the conflict. (Assuming default non-replicator
>>> _bulk_docs calls.)
>>>
>>
>> Am I missing something? Current bulk_docs implementation will introduce a
>> conflict in case of conflict, not just reject and return the error.
>>
>>>> B.
>>>>
>>>> On 22 December 2011 16:48, Alexander Uvarov <alexander.uvarov@gmail.com> wrote:
>>>>> And can become much easier with multi-document transactions as an option.
>>>>>
>>>>> On Thu, Dec 22, 2011 at 10:43 PM, Pepijn de Vos <pepijndevos@yahoo.com> wrote:
>>>>>> But not everyone needs a cluster. I like CouchDB because it's easy,
>>>>>> not because "it scales", and in some situations, all_or_nothing is easy.
>>>>>>
>>>
>>> Robert mentions it in passing, but the biggest reason that we dropped
>>> the original _bulk_docs behavior doesn't have anything to do with
>>> clustering. It was because the semantics are violated as soon as you
>>> try and replicate. Since there's no tracking of the group of docs
>>> posted to _bulk_docs then as soon as your mobile client tried to move
>>> data in or out you'd lose all three of ACI in ACID.
>>
>> Won't every system with a multi-master architecture cause problems as soon
>> as you try to replicate? Should this force people to design for replication
>> even if they don't need it? In my first message I mentioned that not every
>> application needs to be replicated. There are thousands of such apps in the
>> world. Even if it's possible to design an app for replication, it can be
>> very hard to do, and the developer and probably future users will spend a
>> lot of time on something superfluous.
>
> It's possible, but expensive, to have multi-master architecture and
> transaction isolation, but it involves distributed commit protocols.
>
> The wiki documentation is maybe slightly misleading in that the
> guarantees provided by the current Apache CouchDB around
> all_or_nothing have nothing to do with database crashes. All
> _bulk_docs requests are written as a single group commit with a single
> database header write, so either all valid, non-conflicting writes are
> durably stored or none are. all_or_nothing lets validation functions
> reject the whole bulk rather than just the failing write, and then
> during the commit phase create conflicts rather than returning an
> error.
>
> Here's the key: if your documents are known to be valid (or you don't
> have a validate_doc_update function in your database), then the only
> difference is whether conflicting writes are stored as conflicts or
> rejected, not whether all writes hit disk durably, as the wiki might
> seem to suggest.
>
> The replicator uses a flag on the query parameter to create conflicts
> rather than rejecting them: ?new_edits=false. If you can tolerate
> conflicts please feel free to create your own revision ids (bump the
> leading number, create a random id, and slap them together with a
> dash) and use ?new_edits=false. You'll get the same semantics with
> respect to conflicts as all_or_nothing. You lose little by generating
> your own revision ids since deterministic revision ids are an
> optimization for replication. Maybe that lets you move forward with
> your use case.
>
> More to the point though... I find replication is one of CouchDB's
> killer features and that's why some devs (like me and Paul) would
> rather see all_or_nothing vanish completely. If you need relational
> consistency but not replication you might be better served elsewhere.
> I won't tell you to go away (I love our users, and so I'm offering a
> lesser-known workaround with ?new_edits) but I won't mislead you about
> the goals of the project either.
>
> -Randall

I didn't realize when I wrote this that new_edits is actually
documented [1]. I hope that helps!

Cheers,
Randall

[1] https://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Posting_Existing_Revisions
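
For anyone who wants to try the workaround described above, here is a rough sketch of building the request body. It only constructs the JSON and does not talk to a server. The function names, document ids, and fields are made up for illustration. It also assumes two things that are not stated in this thread: that _bulk_docs takes new_edits as a field in the JSON body, and that each document should carry a _revisions field linking the new revision to its parent so that it extends the existing branch. Please check both against the wiki page [1] before relying on them.

```python
import json
import secrets


def next_revision(current_rev=None):
    """Bump the leading number, make a random id, join them with a dash."""
    generation = int(current_rev.split("-", 1)[0]) + 1 if current_rev else 1
    return f"{generation}-{secrets.token_hex(16)}"


def with_new_revision(doc):
    """Return a copy of doc carrying a hand-made successor revision."""
    old_rev = doc.get("_rev")
    new_rev = next_revision(old_rev)
    generation, new_id = new_rev.split("-", 1)
    # Assumption: _revisions lists ids newest first, without the number prefix.
    ids = [new_id]
    if old_rev:
        ids.append(old_rev.split("-", 1)[1])
    updated = dict(doc)
    updated["_rev"] = new_rev
    updated["_revisions"] = {"start": int(generation), "ids": ids}
    return updated


def bulk_payload(docs):
    """Build a _bulk_docs body that asks the server to keep our revision ids."""
    # Assumption: new_edits is passed in the body rather than the query string.
    return {"new_edits": False, "docs": [with_new_revision(d) for d in docs]}


if __name__ == "__main__":
    # Hypothetical documents: one existing (has a _rev), one brand new.
    docs = [
        {"_id": "account-a", "_rev": "2-0123456789abcdef0123456789abcdef", "balance": 90},
        {"_id": "account-b", "balance": 10},
    ]
    print(json.dumps(bulk_payload(docs), indent=2))
```

The printed body would be POSTed to the database's _bulk_docs endpoint. As noted in the message above, writes made this way are stored as conflicts rather than rejected, so the application has to be prepared to find and resolve conflicting revisions afterwards.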
