Subject: Re: tinkering with limits while replicating
From: Nathan Vander Wilt
Date: Tue, 5 Feb 2013 10:22:44 -0800
To: user@couchdb.apache.org
In-Reply-To: <097C0DAA-FF04-4EC7-905C-22310DA33EC1@gmail.com>

Hi Stephen,

I've been doing some tests related to replication lately too (continuous+filtered in my case). I suspect the reason Futon hangs is because your whole VM is running out of RAM due to your very high os_process_limit. I went into a bit more detail in http://mail-archives.apache.org/mod_mbox/couchdb-dev/201302.mbox/%3c70278F4A-FD08-4818-89B7-EA1B0AF846F5@gmail.com%3e but this setting basically determines the size of the couchjs worker pool: you'd probably rather have a bit of contention for the pool at a reasonable size (maybe ~100 per GB free, tops?) than start paging.

hth,
-natevw

On Feb 4, 2013, at 5:15 PM, Stephen Bartell wrote:

> Hi all,
>
> I'm hitting some limits while replicating, and I'm hoping someone could advise.
> I'm running this in a VM on my macbook with the following allocated resources:
> ubuntu 11.04
> 4 cores @ 2.3ghz
> 8 gb mem
>
> I'm doing a one-to-many replication.
> 1) I create one db named test.
> 2) Then create [test_0 .. test_99] databases.
> 3) I then set up replications from test -> [test_0 .. test_99]. 100 replications total.
> 4) I finally go to test and create a doc, hit save.
>
> When I hit save, futon becomes completely unresponsive for around 10sec. It eventually returns to normal behavior.
>
> Tailing the couchdb log I find waves of the following errors:
> [Tue, 05 Feb 2013 00:46:26 GMT] [info] [<0.6936.1>] Retrying POST request to http://admin:*****@localhost:5984/test_25/_revs_diff in 1.0 seconds due to error {code,503}
>
> I see that the replicator is finding the server to be unresponsive. The waves of these messages show that replicator retries in 0.25 sec, then 0.5 sec, then 1sec, then 2sec. This is expected. Everything settles down after about 4 retries.
>
> So my first thought is resource limits. I threw the book at it and set:
> 1) max_dbs_open: 500
> 2) os_process_limit: 5000
> 3) http_connections: 20000
> 4) ulimit -Sn 4096 (the hard limit is 4096)
>
> I really don't know what's reasonable for these values relative to how many replications I am setting up. So these values, save max_dbs_open, are all stabs in the dark.
>
> No change in performance.
>
> So, I'm at a loss now. What can I do to get all this to work? Or what am I doing wrong? And note that this is only a test. I aim to quadruple the number of replications and have lots and lots of insertions on the so-called "test" database. Actually, there will be several of these one-to-many databases.
>
> I've heard people get systems up to thousands of dbs and replicators running just fine. So I hope I'm just not offering the right sacrifices up to couchdb yet.
>
> Thanks for any insight,
>
> sb
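
For reference, the two numeric rules of thumb in this thread can be sketched in a few lines of Python. This is only an illustration, not anything CouchDB provides: the function names and the `per_gb` parameter are made up for the example, the "~100 per GB free" figure is Nathan's tentative guess rather than a documented recommendation, and the doubling delays simply mirror what Stephen observed in his log (0.25, 0.5, 1, 2 seconds).

```python
def suggested_os_process_limit(free_gb, per_gb=100):
    """Rough upper bound for os_process_limit.

    Assumption: about `per_gb` couchjs processes per GB of free RAM,
    following the tentative "~100 per GB free, tops?" guess in the reply.
    """
    if free_gb < 0:
        raise ValueError("free_gb must be non-negative")
    # Never suggest fewer than one process.
    return max(1, int(free_gb * per_gb))


def backoff_delays(first=0.25, retries=4):
    """Delays in seconds that double on each retry.

    Mirrors the retry pattern seen in the quoted log; the replicator's
    real schedule may differ.
    """
    return [first * 2 ** attempt for attempt in range(retries)]
```

With all 8 GB of the VM free, `suggested_os_process_limit(8)` gives 800, well below the 5000 that was configured, and `sum(backoff_delays())` comes to 3.75 seconds of waiting across four retries.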