From: Mike Kimber
To: "user@couchdb.apache.org"
Date: Thu, 12 Apr 2012 13:32:30 -0400
Subject: RE: BigCouch -
 Replication failing with Cannot Allocate memory

Ok, I have 3 nodes, all load balanced with HAProxy:

CentOS 5.8 (virtualised)
2 cores
2GB RAM

I'm trying to replicate about 75K documents, which total 6GB when compacted (on CouchDB 1.2, which has compression turned on). I'm told they are fairly large documents.

When it goes pear-shaped, vmstat shows a lot of memory being used:

procs ---------memory--------- ---swap---- ----io----- -system-- -----cpu------
 r  b   swpd   free buff cache    si    so    bi    bo   in   cs us sy id wa st
 1  2 570576   8808  140  7208  2998  2249  3154  2249 1234  569  1  6  2 91  0
 0  2 569656   9156  156  7504  2330  1899  2405  1904 1246  595  1  5  9 85  0
 1  1 575412   9516  236 14928  1549  2261  3242  2261 1237  593  1  7  1 91  0
 0  2 607092  13220  168  8156  3772  9012  3871  9017 1284  714  1 10  4 85  0
 1  0 444336 857004  220 10212  5781     0  6202     0 1574 1010 13  7 33 47  0
 1  0 442176 870684  428 11052  2049     0  2208   140 2561 1541 17  8 49 26  0
 0  0 442176 813140  460 11968   170     0   348     0 2672 1565 25  9 61  4  0
 0  1 442176 744972  484 12224  5440     0  5493     7 2432  900  8  4 49 40  0
 0  1 442176 714048  484 12296  4547     0  4547     0 1799  827  4  2 50 44  0
 0  1 442176 686304  496 12688  5128     0  5222     0 1696  999  9  2 50 40  0
 0  3 444000   8712  444 12876   299   368   331   380 1294  188 22 20 36 23  0
 0  3 469340  10040  116  7336    29  5087    74  5090 1232  268  3 22  0 75  0
 1  2 584356  10220  124  6744 11367 28722 11370 28722 1643 1300  5 19 17 59  0
 0  1 624908  10640  132  7036  6518 12879  6590 12884 1296  717  3 10 29 58  0
 0  2 652556  10948  252 14776  3799  9494  5459  9494 1294  646  2  9 32 57  0
 0  2 677784  10648  244 14528  3819  8196  3819  8201 1274  588  2  7 30 61  0
 0  2 688460   9512  212  8224  3013  4522  3125  4522 1379  519  2  7  6 84  0
 0  3 699164   9888  208  8468  2192  4014  2228  4014 1302  495  1  6 11 83  0
 2  0 713104   9004  144  9192  2606  4490  2848  4490 1350  487  1  8 16 75  0

It only ever takes out one node at a time, and the other nodes seem to be doing very little while the one node is running out of memory.

If I kick it off again, it processes some more and then spikes the memory and fails.

Thanks

Mike

PS: Hope you enjoyed your CouchDB get-together!

-----Original Message-----
From: Robert Newson [mailto:rnewson@apache.org]
Sent: 12 April 2012 17:28
To: user@couchdb.apache.org
Subject: Re: BigCouch - Replication failing with Cannot Allocate memory

What kind of load were you putting the machine on?

On 12 April 2012 17:24, Robert Newson wrote:
> Could you show your vm.args file?
>
> On 12 April 2012 17:23, Robert Newson wrote:
>> Unfortunately your request for help coincided with the two day CouchDB
>> Summit. #cloudant and the Issues tab on cloudant/bigcouch are other
>> ways to get bigcouch support, but we happily answer queries here too,
>> when not at the Model UN of CouchDB. :D
>>
>> B.
>>
>> On 12 April 2012 17:10, Mike Kimber wrote:
>>> Looks like this isn't the right place based on the responses so far. Shame, I hoped this was going to help solve our index/view rebuild times etc.
>>>
>>> Mike
>>>
>>> -----Original Message-----
>>> From: Mike Kimber [mailto:mkimber@kana.com]
>>> Sent: 10 April 2012 09:20
>>> To: user@couchdb.apache.org
>>> Subject: BigCouch - Replication failing with Cannot Allocate memory
>>>
>>> I'm not sure if this is the correct place to raise an issue I am having with replicating a standalone CouchDB 1.1.1 to a 3-node BigCouch cluster. If this is not the correct place, please point me in the right direction; if it is, does anyone have any ideas why I keep getting the following error message when I kick off a replication:
>>>
>>> eheap_alloc: Cannot allocate 1459620480 bytes of memory (of type "heap").
>>>
>>> My set-up is:
>>>
>>> Standalone CouchDB 1.1.1 running on CentOS 5.7
>>>
>>> 3-node BigCouch cluster running on CentOS 5.8, pulling from the standalone CouchDB (78K documents), with the following local.ini overrides:
>>>
>>> [httpd]
>>> bind_address = XXX.XX.X.XX
>>>
>>> [cluster]
>>> ; number of shards for a new database
>>> q = 9
>>> ; number of copies of each shard
>>> n = 1
>>>
>>> [couchdb]
>>> database_dir = /other/bigcouch/database
>>> view_index_dir = /other/bigcouch/view
>>>
>>> The error is always generated on the third node in the cluster, and the server basically maxes out on memory beforehand. The other nodes seem to be doing very little, but are getting data, i.e. the shard sizes are growing. I've put the copies per shard down to 1 as currently I'm not interested in resilience.
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Mike
>>>
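
For reading vmstat output like the sample in this thread, a small script can flag the intervals where a node is swapping heavily rather than doing useful work. The following is only a sketch in Python: the function names and both thresholds (1000 for combined si + so, 50 for wa) are arbitrary assumptions chosen for illustration, not values taken from BigCouch or from the vmstat documentation, and the three sample rows are copied from the vmstat output earlier in the message.

```python
# Column order as printed by vmstat: procs, memory, swap, io, system, cpu.
COLUMNS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
           "bi", "bo", "in", "cs", "us", "sy", "id", "wa", "st"]


def parse_vmstat(text):
    """Turn vmstat output into a list of dicts, skipping the header lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows are exactly 17 integers; the two header lines are not.
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def is_thrashing(row, swap_per_s=1000, iowait_pct=50):
    """Hypothetical rule of thumb: heavy swap traffic plus high I/O wait."""
    return row["si"] + row["so"] >= swap_per_s and row["wa"] >= iowait_pct


# Three rows copied from the sample in the message (start, recovery, second spike).
SAMPLE = """\
procs ---------memory--------- ---swap---- ----io----- -system-- -----cpu------
 r  b   swpd   free buff cache    si    so    bi    bo   in   cs us sy id wa st
 1  2 570576   8808  140  7208  2998  2249  3154  2249 1234  569  1  6  2 91  0
 0  0 442176 813140  460 11968   170     0   348     0 2672 1565 25  9 61  4  0
 1  2 584356  10220  124  6744 11367 28722 11370 28722 1643 1300  5 19 17 59  0
"""

rows = parse_vmstat(SAMPLE)
flags = [is_thrashing(row) for row in rows]

for row, bad in zip(rows, flags):
    # Print free memory, total swap traffic, I/O wait, and the verdict.
    print(row["free"], row["si"] + row["so"], row["wa"],
          "THRASHING" if bad else "ok")
```

Feeding it the full vmstat capture from the affected node would show how long each swap storm lasts before the eheap_alloc failure, which may help when comparing runs after changing vm.args or the replication load.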