incubator-couchdb-user mailing list archives

From "Ronny Hanssen" <super.ro...@gmail.com>
Subject Re: Bulk Load
Date Thu, 18 Sep 2008 01:35:29 GMT
Hm.

In Paul's case I am not 100% sure what is going on. Here's a use case for
two concurrent edits:
  * First two users get the original.
  * Both make a copy, which they save.
This means that there are two fresh docs in CouchDB (even on a single
node).
  * Save the original under a new doc._id (which the copy persists in
copy.previous_version).
This means that the two new docs know where to find their previous
versions. The problem I have with this scheme is that every change to a
document stores not only the new version but also its old version (in
addition to the original). The fact that two racing updates will generate
4(!) new docs in addition to the original document is worrying. I guess
Paul also wants the original to be marked as deleted in the _bulk_docs?
But in any case the previous versions are now two new docs that look
exactly the same, except for the doc._id, naturally...

Wouldn't this be enough Paul?
1. old = get_doc()
2. update = clone(old);
3. update.previous_version = old._id;
4. post via _bulk_docs
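
A minimal Python sketch of the four steps above. It only builds the request
body and sends nothing. The helper names (build_update, bulk_docs_body) and
the sample field values are made up for illustration, and it assumes that
dropping _id and _rev from the clone makes the server store it as a new doc.

```python
import copy
import json


def build_update(old, changes):
    """Clone `old`, apply `changes`, and point back at the old doc."""
    update = copy.deepcopy(old)          # step 2: update = clone(old)
    # Assumption: without _id and _rev the clone is saved as a new doc
    # instead of a new revision of the old one.
    update.pop("_id", None)
    update.pop("_rev", None)
    update.update(changes)
    update["previous_version"] = old["_id"]   # step 3
    return update


def bulk_docs_body(docs):
    """Build a JSON body of the form {"docs": [...]} for _bulk_docs."""
    return json.dumps({"docs": docs})


# Step 1 would be a real fetch; here `old` is a placeholder document.
old = {"_id": "a1", "_rev": "1", "title": "Draft"}
update = build_update(old, {"title": "Final"})
body = bulk_docs_body([update])          # step 4: this is what gets posted
```

The old doc is never rewritten here, so only one new doc is created per edit.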

This way there won't be multiple old docs around.

Jan's way ensures that for a view there is always only one current version
of a doc, since it uses the built-in rev-control. Competing updates on
the same node may fail, which is what CouchDB is designed to handle. If
they happen on different nodes, the rev-control history might get "out of
sync" via concurrent updates. How does CouchDB handle this? Which update
wins? On a single node this is intercepted when saving the doc. On multiple
nodes both might get a response saying "save complete", so these then need
merging. How is that done? Jan furthermore secures the previous version by
storing it as a new doc, allowing it to persist beyond compaction. I guess
Jan's sample would benefit nicely from _bulk_docs too. I like this method
because it allows only one current doc. But I worry about how revision
control handles conflicts, Jan?
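
To make the single-node case concrete, here is a toy sketch in Python. It is
an in-memory stand-in, not CouchDB and not Jan's actual sample: ToyStore,
Conflict, update_with_history and the archived_from field are all invented
for illustration, and revisions are plain integers. It says nothing about
the multi-node merge question.

```python
import copy
import itertools


class Conflict(Exception):
    """Raised when a save carries a stale _rev."""


class ToyStore:
    """In-memory stand-in for a single node (hypothetical)."""

    def __init__(self):
        self.docs = {}
        self._ids = itertools.count(1)

    def get(self, doc_id):
        return copy.deepcopy(self.docs[doc_id])

    def save(self, doc):
        doc = copy.deepcopy(doc)
        if "_id" not in doc:
            doc["_id"] = "doc-%d" % next(self._ids)
        stored = self.docs.get(doc["_id"])
        # Reject the write if the caller did not start from the stored rev.
        if stored is not None and stored["_rev"] != doc.get("_rev"):
            raise Conflict(doc["_id"])
        doc["_rev"] = (stored["_rev"] if stored else 0) + 1
        self.docs[doc["_id"]] = doc
        return doc


def update_with_history(store, current, changes):
    """Update the current doc, then keep the old content as a separate doc."""
    updated = store.save({**current, **changes})   # may raise Conflict
    archive = {k: v for k, v in current.items() if k not in ("_id", "_rev")}
    archive["archived_from"] = current["_id"]
    store.save(archive)                            # new doc, new _id
    return updated


store = ToyStore()
store.save({"_id": "page", "text": "v1"})
a, b = store.get("page"), store.get("page")        # two racing readers
update_with_history(store, a, {"text": "from A"})
try:
    update_with_history(store, b, {"text": "from B"})
    conflicted = False
except Conflict:
    conflicted = True                              # B has to re-read and retry
```

In this toy the second writer is rejected before anything is archived, so
there is still exactly one current doc plus one archived copy.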

Paul's and my updated suggestion always post new versions, not using the
revision system at all. The downside is that there may be multiple current
versions around... And this is a bit tricky, I believe... Anyone?
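
To illustrate the "multiple current versions" problem: if every new version
carries previous_version, then a current version is any doc that no other
doc points back to. A small Python sketch over plain dicts (the function
name and the sample docs are made up; in practice this would probably be a
view):

```python
def current_versions(docs):
    """Return the docs that no other doc names as its previous_version."""
    referenced = {d["previous_version"] for d in docs if "previous_version" in d}
    return [d for d in docs if d["_id"] not in referenced]


# Two racing edits of the same original leave two "current" docs.
docs = [
    {"_id": "orig", "text": "v1"},
    {"_id": "u1", "text": "from user 1", "previous_version": "orig"},
    {"_id": "u2", "text": "from user 2", "previous_version": "orig"},
]
heads = current_versions(docs)
```

Whenever this returns more than one doc for the same lineage, something
(or someone) still has to decide which one wins or how to merge them.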

Paul's suggestion also keeps multiple copies of the previous version. I am
not sure why, Paul?


Regards,
Ronny

2008/9/17 Paul Davis <paul.joseph.davis@gmail.com>

> Good point, Chris.
>
> On Wed, Sep 17, 2008 at 11:39 AM, Chris Anderson <jchris@apache.org>
> wrote:
> > On Wed, Sep 17, 2008 at 11:34 AM, Paul Davis
> > <paul.joseph.davis@gmail.com> wrote:
> >> Alternatively something like the following might work:
> >>
> >> Keep an eye on the specifics of _bulk_docs though. There have been
> >> requests to make it non-atomic, but I think in the face of something
> >> like this we might make non-atomic _bulk_docs a non-default or some
> >> such.
> >
> > I think the need for non-transaction bulk-docs will be obviated when
> > we have the failure response say which docs caused failure, that way
> > one can retry once to save all the non-conflicting docs, and then loop
> > back through to handle the conflicts.
> >
> > upshot: I bet you can count on bulk docs being transactional.
> >
> >
> > --
> > Chris Anderson
> > http://jchris.mfdz.com
> >
>
