One more observation. It seems the memory goes up dramatically while the
replicator task is writing all the failed-to-replicate docs to the log
(which ends with this):

** Reason for termination ==
** {http_request_failed,<<"failed to replicate http://host/db">>}

Is there a way to disable logging for the replicator? Interestingly
enough, as soon as we restart, the replicator simply catches up and
pretends there were no problems.

K.
---
http://blog.mudynamics.com
http://blitz.io
@pcapr

On Thu, Sep 1, 2011 at 7:18 AM, kowsik wrote:
> Right before I sent this email we restarted CouchDB and now it's at
> 14% memory usage and climbing. Is there anything we can look at
> stats-wise to see where the pressure in the system is? I realize task
> stats are being added to trunk, but on 1.1, anything?
>
> Thanks,
>
> K.
> ---
> http://blog.mudynamics.com
> http://blitz.io
> @pcapr
>
> On Thu, Sep 1, 2011 at 6:35 AM, Scott Feinberg wrote:
>> I haven't had that issue, though I'm not using 1.1 in a
>> production environment, just using it to replicate like crazy (millions of
>> docs in each of my 20+ databases). I was running a server with 1 GB of
>> memory and didn't have an issue; it handled it fine.
>>
>> However... from http://docs.couchbase.org/couchdb-release-1.1/index.html
>>
>> When you PUT/POST a document to the _replicator database, CouchDB will
>> attempt to start the replication up to 10 times (configurable under
>> [replicator], parameter max_replication_retry_count).
>>
>> Not sure if that helps.
>>
>> --Scott
>>
>> On Thu, Sep 1, 2011 at 9:28 AM, kowsik wrote:
>>
>>> Ran into this twice so far in production CouchDB in the last two days.
>>> We are running CouchDB 1.1 on an EC2 AMI with multi-master replication
>>> across two regions. I notice that every now and then CouchDB will
>>> simply suck up 100% CPU and 50% of the total memory and not respond at
>>> all. So far the logs only show sporadic replication errors. One of the
>>> stack traces (failed to replicate after 10 times) is about 500,000
>>> lines long. We are using the _replicator database.
>>>
>>> Anyone else running into this? Since 1.1 doesn't have the
>>> try-until-infinity-and-beyond mode, we have a worker task that watches
>>> the _replication_state and kicks the replicator as soon as it errors
>>> out. Are there any settings in terms of replicator memory usage, etc. that
>>> could help us?
>>>
>>> Thanks!
>>>
>>> K.
>>> ---
>>> http://blog.mudynamics.com
>>> http://blitz.io
>>> @pcapr
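
For anyone following along: the worker task mentioned in the original message (watch `_replication_state`, kick the replicator when it errors out) could look roughly like the sketch below. This is not the actual worker from the thread, just a minimal illustration. The function names, the `base_url` argument, and the choice to re-trigger by deleting and re-creating the `_replicator` document are all assumptions, and it does no authentication or error handling.

```python
import json
import urllib.parse
import urllib.request

# Fields CouchDB itself writes into _replicator documents.
SERVER_FIELDS = ("_replication_state", "_replication_state_time", "_replication_id")


def failed_replications(rows):
    """Pick the docs whose _replication_state is "error".

    `rows` is the "rows" list of an _all_docs?include_docs=true response.
    """
    docs = (row.get("doc") or {} for row in rows)
    return [d for d in docs if d.get("_replication_state") == "error"]


def fresh_copy(doc):
    """Drop server-managed fields and _rev so the doc can be written again."""
    skip = SERVER_FIELDS + ("_rev",)
    return {k: v for k, v in doc.items() if k not in skip}


def request(method, url, body=None):
    """Tiny JSON-over-HTTP helper (hypothetical, no auth or retries)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def kick_failed(base_url):
    """Delete and re-create every _replicator doc that ended in "error".

    Assumption: re-creating the document is what makes CouchDB start the
    replication again; adjust if your setup re-triggers differently.
    """
    listing = request("GET", base_url + "/_replicator/_all_docs?include_docs=true")
    for doc in failed_replications(listing["rows"]):
        doc_id = urllib.parse.quote(doc["_id"], safe="")
        doc_url = "%s/_replicator/%s" % (base_url, doc_id)
        rev = urllib.parse.quote(doc["_rev"], safe="")
        request("DELETE", doc_url + "?rev=" + rev)
        request("PUT", doc_url, fresh_copy(doc))
```

Something like `kick_failed("http://localhost:5984")` would then be called from a cron job or a sleep loop. Polling at a modest interval, rather than immediately after every failure, may also help avoid piling new retries on top of a node that is already under memory pressure.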