Date: Tue, 5 Feb 2013 19:02:03 +0000
Subject: Re: tinkering with limits while replicating
From: Robert Newson
To: "user@couchdb.apache.org"

If you change settings via HTTP requests to _config, no, but if you just changed the .ini file on disk, yes.
It's best to use PUT/GET to _config/section/key imo.

B.

On 5 February 2013 18:56, Stephen Bartell wrote:
> Nathan,
>
> I dropped the pool size down to 500 and still the same story. I also tried lowering the number of replicator processes down to 1 per replicator. Again, same thing.
>
> All the while, I keep an eye on how much memory beam.smp consumes during one replication "wave" and it never exceeds 2%. So I'm reluctant to think that the OS is running out of memory. It does seem like there's some sort of process contention, however. The error code that the replicators are reporting while trying to POST is 503. I assume that this is for the web server being unavailable.
>
> Yes, I'm going to add filtering on top of this, and I think I'm going to need to do those in eel, although I'd like to try first to avoid it.
>
> This is probably a dumb question: do I need to restart couch after changes with these settings?
>
>
> On Feb 5, 2013, at 10:22 AM, Nathan Vander Wilt wrote:
>
>> Hi Stephen,
>>
>> I've been doing some tests related to replication lately too (continuous+filtered in my case). I suspect the reason Futon hangs is because your whole VM is running out of RAM due to your very high os_process_limit. I went in to a bit more detail in http://mail-archives.apache.org/mod_mbox/couchdb-dev/201302.mbox/%3c70278F4A-FD08-4818-89B7-EA1B0AF846F5@gmail.com%3e but this setting basically determines the size of the couchjs worker pool; you'd probably rather have a bit of contention for the pool at a reasonable size (maybe ~100 per GB free, tops?) than start paging.
>>
>> hth,
>> -natevw
>>
>>
>>
>> On Feb 4, 2013, at 5:15 PM, Stephen Bartell wrote:
>>
>>> Hi all,
>>>
>>> I'm hitting some limits while replicating, I'm hoping someone could advise.
>>> I'm running this in a VM on my macbook with the following allocated resources:
>>> ubuntu 11.04
>>> 4 cores @ 2.3ghz
>>> 8 gb mem
>>>
>>> I'm doing a one-to-many replication.
>>> 1) I create one db named test.
>>> 2) Then create [test_0 .. test_99] databases.
>>> 3) I then set up replications from test -> [test_0 .. test_99]. 100 replications total.
>>> 4) I finally go to test and create a doc, hit save.
>>>
>>> When I hit save, futon becomes completely unresponsive for around 10sec. It eventually returns to normal behavior.
>>>
>>> Tailing the couchdb log I find waves of the following errors:
>>> [Tue, 05 Feb 2013 00:46:26 GMT] [info] [<0.6936.1>] Retrying POST request to http://admin:*****@localhost:5984/test_25/_revs_diff in 1.0 seconds due to error {code,503}
>>>
>>> I see that the replicator is finding the server to be unresponsive. The waves of these messages show that replicator retries in 0.25 sec, then 0.5 sec, then 1sec, then 2sec. This is expected. Everything settles down after about 4 retries.
>>>
>>> So my first thought is resource limits. I threw the book at it and set:
>>> 1) max_dbs_open: 500
>>> 2) os_process_limit: 5000
>>> 3) http_connections: 20000
>>> 4) ulimit -Sn 4096 (the hard limit is 4096)
>>>
>>> I really don't know what's reasonable for these values relative to how many replications I am setting up. So these values, save max_dbs_open, are all stabs in the dark.
>>>
>>> No change in performance.
>>>
>>> So, I'm at a loss now. What can I do to get all this to work? Or what am I doing wrong? And note that this is only a test. I aim to quadruple the amount of replications and have lots and lots of insertions on the so-called "test" database. Actually, there will be several of these one-to-many databases.
>>>
>>> I've heard people get systems up to thousands of dbs and replicators running just fine. So I hope I'm just not offering the right sacrifices up to couchdb yet.
>>>
>>> Thanks for any insight,
>>>
>>> sb
>>>
>>
>
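
For anyone landing on this thread later: the advice at the top (PUT/GET to _config/section/key, so no restart is needed) might look roughly like the Python sketch below. Treat it as a sketch under assumptions, not a recipe. It assumes a CouchDB 1.x node at http://localhost:5984 with no authentication. The thread only names the keys; the sections I file them under ([couchdb], [query_server_config], [replicator]) are my assumption about a default install, so check your own .ini files. The values are illustrative placeholders, and the helper names (config_url, build_put, apply_settings) are made up for this example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # assumption: local node, no auth

# Assumed section for each key named in the thread; values are placeholders.
SETTINGS = {
    ("couchdb", "max_dbs_open"): "500",
    ("query_server_config", "os_process_limit"): "100",
    ("replicator", "http_connections"): "20",
}


def config_url(section, key, base_url=BASE_URL):
    """Build the _config/section/key URL for one setting."""
    return "{}/_config/{}/{}".format(base_url.rstrip("/"), section, key)


def build_put(section, key, value, base_url=BASE_URL):
    """Prepare (but do not send) a PUT that sets one config value."""
    # The value travels as a JSON string, e.g. "500" including the quotes.
    body = json.dumps(str(value)).encode("utf-8")
    return urllib.request.Request(
        config_url(section, key, base_url),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def apply_settings(settings=SETTINGS, base_url=BASE_URL):
    """Send one PUT per setting and collect whatever JSON the server returns."""
    responses = {}
    for (section, key), value in settings.items():
        request = build_put(section, key, value, base_url)
        with urllib.request.urlopen(request) as resp:
            responses[(section, key)] = json.loads(resp.read().decode("utf-8"))
    return responses
```

Calling apply_settings() against a running node sends the three PUTs; build_put() on its own touches no network, which makes it easy to inspect the URL and body first. A plain GET on the same URL reads the current value back.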