couchdb-user mailing list archives

From: Riyad Kalla <rka...@gmail.com>
Subject: Re: replicating docs with tons of conflicts
Date: Thu, 14 Mar 2013 18:09:50 GMT
Stephen,
I am probably wrong here (someone hop in and correct me), but I thought
compaction would remove the old revisions (and conflicts) of docs.
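
If it helps, compaction is per-database and is kicked off with a POST to
_compact.  A minimal sketch of that call -- the localhost:5984 URL, the
"mydb" name, and a Node runtime with a global fetch are all assumptions on
my part, not something from Stephen's setup:

    // POST /{db}/_compact schedules compaction for that database
    // ("mydb" and localhost:5984 are placeholders)
    fetch('http://localhost:5984/mydb/_compact', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' }
    }).then(res => console.log(res.status)); // 202 means compaction was scheduled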

Alternatively, a question for the Couch devs: if Stephen set _revs_limit to
something artificially low, say 1, and restarted couch and did a compaction,
would that force the DB to smash the datastore down to 1 rev per doc and
remove the long tail off these docs?
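
Something like this is what I have in mind (same placeholder host, db name,
and global-fetch assumption as above; _revs_limit takes a bare integer body,
and the _compact call above would then be re-run):

    // PUT /{db}/_revs_limit caps the revision tree depth for that database
    fetch('http://localhost:5984/mydb/_revs_limit', {
      method: 'PUT',
      headers: { 'Content-Type': 'application/json' },
      body: '1'
    }).then(res => res.json()).then(console.log); // {"ok":true} on success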

REF: http://wiki.apache.org/couchdb/Compaction

On Thu, Mar 14, 2013 at 2:02 AM, Stephen Bartell <snbartell@gmail.com> wrote:

> Hi all,
>
> tldr; I've got a database with just a couple docs.  Conflict management
> went unchecked and these docs have thousands of conflicts each.
>  Replication fails.  Couch consumes all the server's cpu.
>
> First the story, then the questions.  Please bear with me!
>
> I wanted to replicate this database to another, new database.  So I
> started the replication.  beam.smp took 100% of my cpu and the replicator
> status held steady at a constant percent for quite a while.  It eventually
> finished.
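>
> (By "started the replication" I just mean a one-shot POST to _replicate,
> more or less like this -- the host and db names are placeholders, and I'm
> leaning on Node's global fetch for brevity:)
>
>     // push everything from "src" to "dst" via CouchDB's _replicate endpoint
>     fetch('http://localhost:5984/_replicate', {
>       method: 'POST',
>       headers: { 'Content-Type': 'application/json' },
>       body: JSON.stringify({ source: 'src', target: 'dst', create_target: true })
>     }).then(res => res.json()).then(console.log);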
>
> I thought maybe I should handle the conflicts and then replicate.
>  Hopefully it'll go faster next time.  So I cleared all the conflicts.  I
> replicated again but this time I could not get anything to replicate.
>  Again, cpu held steady, topped out. I eventually restarted couch.
>
> I dug through the logs and saw that the POSTs were failing.  I figure
> that the replicator was timing out when trying to post to couch.
>
> I have a replicator that I've been working on that's written in node.js.
>  So I started that one up to do the same thing.  I drew inspiration from
> PouchDB's replicator and from Jens Alfke's amazing replication algorithm
> documentation, so my replicator follows more or less the same story.  1)
> consume _changes with style=all_docs.  2) _revs_diff on the target database.
>  3) get each revision from source with revs=true.  4) bulk post with
> new_edits=false.
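>
> In code it's roughly this (stripped way down; the localhost URL, the "src"
> and "dst" database names, and the use of Node's global fetch are all
> placeholders rather than my real replicator):
>
>     const COUCH = 'http://localhost:5984';
>
>     async function replicate(source, target) {
>       // 1) read _changes with style=all_docs so conflicting leaf revs show up
>       const changes = await (await fetch(
>         `${COUCH}/${source}/_changes?style=all_docs`)).json();
>
>       // collect every leaf rev per doc id
>       const revMap = {};
>       for (const row of changes.results) {
>         revMap[row.id] = row.changes.map(c => c.rev);
>       }
>
>       // 2) ask the target which of those revs it is missing
>       const diff = await (await fetch(`${COUCH}/${target}/_revs_diff`, {
>         method: 'POST',
>         headers: { 'Content-Type': 'application/json' },
>         body: JSON.stringify(revMap)
>       })).json();
>
>       // 3) fetch each missing rev from the source, with its rev history
>       const docs = [];
>       for (const [id, info] of Object.entries(diff)) {
>         for (const rev of info.missing) {
>           docs.push(await (await fetch(
>             `${COUCH}/${source}/${encodeURIComponent(id)}?rev=${rev}&revs=true`)).json());
>         }
>       }
>
>       // 4) write them to the target as-is, without generating new revisions
>       const res = await fetch(`${COUCH}/${target}/_bulk_docs`, {
>         method: 'POST',
>         headers: { 'Content-Type': 'application/json' },
>         body: JSON.stringify({ docs, new_edits: false })
>       });
>       console.log('bulk post status:', res.status);
>     }
>
>     replicate('src', 'dst').catch(console.error);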
>
> Same thing.  Except now I can kind of make sense of what's going on.
>  Sucking the data out of the source is no problem.  Diffing the revs
> against the target is no problem.  Posting the docs is THE problem.  Since
> the target database is clean, thousands of doc revisions are being thrown at
> couch at once to build up the revision trees.  Couch just takes forever to
> finish the job.  It doesn't matter if I bulk post the docs or post them
> individually; couch sucks 100% of my cpu every time and takes forever to
> finish.  (I actually never let it finish.)
>
> So that is the story.  Here are my questions.
>
> 1) Has anyone else stepped on this mine?  If so, could I get pointed
> towards some workarounds?  I don't think it is right to make the assumption
> that users of couchdb will never have databases with huge conflict sausages
> like this.  So simply saying "manage your conflicts" won't help.
>
> 2) Let's say I did manage my conflicts.  I still have the
> _deleted_conflicts sausage.  I know that _deleted and _deleted_conflicts
> must be replicated to maintain consistency across the cluster.  If the
> replicator throws up when these huge sausages come through, how is the data
> ever going to replicate?  Is there a trade secret I don't know about?
>
> 3) Is there any limit on the resources that CouchDB is allowed to consume?
>  I get that we run into these cases where there's tons of data to move
> and it's just going to take a hell of a long time.  But I don't get why it's
> permissible for CouchDB to eat all my cpu.  The whole server should never
> grind to a halt because it's moving lots of data.  I feel like it should be
> like the little train that could.  Just chug along slow and steady until it
> crests the hill.
>
> I would really like to rely on the Erlang replicator, but I can't.  At
> least with the replicator I wrote I have a chance of throttling the posts
> so CouchDB doesn't render my server useless.
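>
> (By throttling I just mean chunking the _bulk_docs payload and posting the
> chunks one at a time; the batch size and pause below are made-up knobs, not
> tuned values, and it assumes the same global fetch as the sketch above:)
>
>     // post docs in small sequential batches instead of one huge _bulk_docs call
>     async function throttledBulkPost(targetUrl, docs, batchSize = 100, pauseMs = 250) {
>       for (let i = 0; i < docs.length; i += batchSize) {
>         await fetch(`${targetUrl}/_bulk_docs`, {
>           method: 'POST',
>           headers: { 'Content-Type': 'application/json' },
>           body: JSON.stringify({ docs: docs.slice(i, i + batchSize), new_edits: false })
>         });
>         // give couch a breather between batches
>         await new Promise(resolve => setTimeout(resolve, pauseMs));
>       }
>     }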
>
> Sorry for wrapping more questions into those questions.  I'm pretty tired,
> stumped, and have machines in production crumbling.
>
> Best,
> Stephen
