Date: Thu, 1 Sep 2011 09:38:11 -0700
Subject: Re: CouchDB 1.1 issue
From: kowsik
To: user@couchdb.apache.org

Some follow-up questions that I'm hoping the devs can answer. I don't
grok Erlang so if these are dumb questions, humor me.

The couchdb script launches erlang with the following parameters:

-Bd - disable breaks
-K true - what's this for?
-A 4 - number of async threads - does concurrency improve if this is increased?

There also seems to be a number of options to set the stack size, heap
size, etc. Anyone played around with these settings to get additional
concurrency/performance boosts?

Thanks,

K.
---
http://blog.mudynamics.com
http://blitz.io
@pcapr

On Thu, Sep 1, 2011 at 7:29 AM, kowsik wrote:
> One more observation. It seems the memory goes up dramatically while
> the replicator task is writing all the failed-to-replicate-docs to the
> log (ends with this)
>
> ** Reason for termination ==
> ** {http_request_failed,<<"failed to replicate http://host/db">>}
>
> Is there a way to disable logging for the replicator? Interestingly
> enough, as soon as we restart, the replicator simply catches up and
> pretends there were no problems.
>
> K.
> ---
> http://blog.mudynamics.com
> http://blitz.io
> @pcapr
>
> On Thu, Sep 1, 2011 at 7:18 AM, kowsik wrote:
>> Right before I sent this email we restarted CouchDB and now it's at
>> 14% memory usage and climbing. Is there anything we can look at
>> stats-wise and see where the pressure in the system is? I realize task
>> stats are being added to trunk, but on 1.1, anything?
>>
>> Thanks,
>>
>> K.
>> ---
>> http://blog.mudynamics.com
>> http://blitz.io
>> @pcapr
>>
>> On Thu, Sep 1, 2011 at 6:35 AM, Scott Feinberg wrote:
>>> I haven't had that issue, though I'm not using 1.1 in a
>>> production environment, just using it to replicate like crazy (millions of
>>> docs in each of my 20+ databases). I was running a server with 1 GB of
>>> memory and didn't have an issue, it handled it fine.
>>>
>>> However... from http://docs.couchbase.org/couchdb-release-1.1/index.html
>>>
>>> When you PUT/POST a document to the _replicator database, CouchDB will
>>> attempt to start the replication up to 10 times (configurable under
>>> [replicator], parameter max_replication_retry_count).
>>>
>>> Not sure if that helps.
>>>
>>> --Scott
>>>
>>> On Thu, Sep 1, 2011 at 9:28 AM, kowsik wrote:
>>>
>>>> Ran into this twice so far in production CouchDB in the last two days.
>>>> We are running CouchDB 1.1 on an EC2 AMI with multi-master replication
>>>> across two regions. I notice that every now and then CouchDB will
>>>> simply suck up 100% CPU and 50% of the total memory and not respond at
>>>> all. So far the logs only show sporadic replication errors. One of the
>>>> stack traces (failed to replicate after 10 times) is about 500,000
>>>> lines long. We are using the _replicator database.
>>>>
>>>> Anyone else running into this? Since 1.1 doesn't have the
>>>> try-until-infinity-and-beyond mode, we have a worker task that watches
>>>> the _replication_state and kicks the replicator as soon as it errors
>>>> out. Are there any settings in terms of replicator memory usage, etc. that
>>>> could help us?
>>>>
>>>> Thanks!
>>>>
>>>> K.
>>>> ---
>>>> http://blog.mudynamics.com
>>>> http://blitz.io
>>>> @pcapr
>>>>
>>>
>>
>
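
For readers following the thread, the "worker task that watches the
_replication_state and kicks the replicator" could look roughly like the
sketch below. This is not the poster's actual worker. It assumes an
unauthenticated (admin party) CouchDB reachable at a hypothetical base_url,
and it assumes that deleting and recreating an errored _replicator document
is an acceptable way to restart it. The helper names (needs_restart,
fresh_copy, request_json, kick_errored) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Fields that CouchDB itself writes into a _replicator document.
COUCH_MANAGED_FIELDS = (
    "_replication_state",
    "_replication_state_time",
    "_replication_id",
)


def needs_restart(doc):
    """True when the replicator has marked this document as errored."""
    return doc.get("_replication_state") == "error"


def fresh_copy(doc):
    """Copy of the doc without _rev and without the CouchDB-managed fields."""
    skip = COUCH_MANAGED_FIELDS + ("_rev",)
    return {key: value for key, value in doc.items() if key not in skip}


def request_json(method, url, body=None):
    """Send a JSON request and decode the JSON response (no auth: assumption)."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def kick_errored(base_url):
    """Delete and recreate every errored replication doc; return their ids."""
    listing = request_json(
        "GET", base_url + "/_replicator/_all_docs?include_docs=true")
    restarted = []
    for row in listing["rows"]:
        doc = row["doc"]
        # Skip design documents and replications that are not in error.
        if doc["_id"].startswith("_design/") or not needs_restart(doc):
            continue
        doc_url = "%s/_replicator/%s" % (
            base_url, urllib.parse.quote(doc["_id"], safe=""))
        request_json("DELETE", doc_url + "?rev=" + doc["_rev"])
        request_json("PUT", doc_url, fresh_copy(doc))
        restarted.append(doc["_id"])
    return restarted
```

Calling kick_errored("http://localhost:5984") from a cron job or a sleep
loop would approximate the watcher described above. The host and port here
are placeholders, and a real deployment would need admin credentials.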