couchdb-user mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Bulk Load
Date Thu, 18 Sep 2008 08:57:05 GMT
Hi Ronny,

I'm not sure what you are trying to achieve here.

My solution is good for a single-node instance, which is, if I
remember correctly, what you asked for. It ignores the multi-node
setup and merging revisions over multiple nodes, which is exactly
what CouchDB does, for the simple reason that it is not easy to do :)

Since you are manually handling the list of past revisions, you'd need
to do the history merge on a multi-node conflict on your own.
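
As a rough illustration of what such a do-it-yourself history merge could
look like, here is a minimal Python sketch. It assumes each document keeps an
oldest-first list of the ids of its previous versions; that list layout and
the name merge_histories are hypothetical, not something CouchDB provides.

```python
def merge_histories(left, right):
    """Merge two oldest-first lists of previous-version ids.

    Assumption: both lists start from the same original document and
    diverge only after some shared prefix.
    """
    # Count how many leading entries the two histories have in common.
    shared = 0
    for l_id, r_id in zip(left, right):
        if l_id != r_id:
            break
        shared += 1

    merged = list(left[:shared])

    # Append the divergent branches in a fixed order so that every node
    # computes the same merged history regardless of argument order.
    for tail in sorted([left[shared:], right[shared:]]):
        for doc_id in tail:
            if doc_id not in merged:
                merged.append(doc_id)
    return merged


node_a = ["v1", "v2", "v3a"]
node_b = ["v1", "v2", "v3b"]
merged = merge_histories(node_a, node_b)
```

Sorting the two divergent tails is only one possible tie-break; the point is
that whatever rule you pick has to give the same answer on every node.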

Cheers
Jan
--

On Sep 18, 2008, at 03:35, Ronny Hanssen wrote:

> Hm.
>
> In Paul's case I am not 100% sure what is going on. Here's a use case for
> two concurrent edits:
>  * First, two users get the original.
>  * Both make a copy, which they save.
> This means that there are two fresh docs in CouchDB (even on a single
> node).
>  * Save the original using a new doc._id (which the copy is to persist in
> copy.previous_version).
> This means that the two new docs know where to find their previous
> versions. The problem I have with this scheme is that every change of a
> document means that it needs to store not only the new version, but also
> its old version (in addition to the original). The fact that two racing
> updates will generate 4(!) new docs in addition to the original document is
> worrying. I guess Paul also wants the original to be marked as deleted in
> the _bulk_docs? But in any case, the previous versions are now two new
> docs, and they look exactly the same, except for the doc._id, naturally...
>
> Wouldn't this be enough, Paul?
> 1. old = get_doc()
> 2. update = clone(old);
> 3. update.previous_version = old._id;
> 4. post via _bulk_docs
>
> This way there won't be multiple old docs around.
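
The four steps above could be sketched in Python roughly as follows. This
assumes the clone is saved as a brand-new document (its _id and _rev are
dropped so the server assigns fresh ones), which is one reading of the scheme
rather than something the steps spell out. The helper names and the sample
document are made up for illustration; only the {"docs": [...]} body shape is
what _bulk_docs expects.

```python
import copy
import json


def make_new_version(old, changes):
    """Steps 2 and 3: clone the fetched doc and link it to its predecessor."""
    update = copy.deepcopy(old)
    # Assumption: the new version becomes a separate doc, so drop the
    # identifiers that tie the clone to the old document.
    update.pop("_id", None)
    update.pop("_rev", None)
    update.update(changes)
    update["previous_version"] = old["_id"]
    return update


def bulk_docs_body(docs):
    """Step 4: build the JSON body to POST to /<db>/_bulk_docs."""
    return json.dumps({"docs": docs})


# Step 1 stand-in: a doc as it might come back from a GET (values invented).
old = {"_id": "doc-1", "_rev": "1-abc", "title": "Draft"}
update = make_new_version(old, {"title": "Final"})
body = bulk_docs_body([update])
```

The old document is left untouched, so only one new doc is written per edit.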
>
> Jan's way ensures that for a view there is always only one current version
> of a doc, since it is using the built-in rev-control. Competing updates on
> the same node may fail, which is what CouchDB is designed to handle. If
> they happen on different nodes, then the rev-control history might get
> "out of sync" via concurrent updates. How does CouchDB handle this? Which
> update wins? On a single node this is intercepted when saving the doc. On
> multiple nodes, both might get a response saying "save complete". So these
> then need merging. How is that done? Jan furthermore secures the previous
> version by storing it as a new doc, allowing it to be persisted beyond
> compaction. I guess Jan's sample would benefit nicely from _bulk_docs too.
> I like this method due to the fact that it allows only one current doc.
> But I worry about how revision control handles conflicts, Jan?
>
> Paul's and my updated suggestion always posts new versions, not using the
> revision system at all. The downside is that there may be multiple current
> versions around... And this is a bit tricky, I believe... Anyone?
>
> Paul's suggestion also keeps multiple copies of the previous version. I am
> not sure why, Paul?
>
>
> Regards,
> Ronny
>
> 2008/9/17 Paul Davis <paul.joseph.davis@gmail.com>
>
>> Good point, Chris.
>>
>> On Wed, Sep 17, 2008 at 11:39 AM, Chris Anderson <jchris@apache.org>
>> wrote:
>>> On Wed, Sep 17, 2008 at 11:34 AM, Paul Davis
>>> <paul.joseph.davis@gmail.com> wrote:
>>>> Alternatively something like the following might work:
>>>>
>>>> Keep an eye on the specifics of _bulk_docs though. There have been
>>>> requests to make it non-atomic, but I think in the face of something
>>>> like this we might make non-atomic _bulk_docs a non-default or some
>>>> such.
>>>
>>> I think the need for non-transactional bulk-docs will be obviated when
>>> we have the failure response say which docs caused failure; that way
>>> one can retry once to save all the non-conflicting docs, and then loop
>>> back through to handle the conflicts.
>>>
>>> upshot: I bet you can count on bulk docs being transactional.
>>>
>>>
>>> --
>>> Chris Anderson
>>> http://jchris.mfdz.com
>>>
>>

