Mailing-List: contact user-help@couchdb.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@couchdb.apache.org
Received-SPF: pass (nike.apache.org: domain of mkimber@kana.com designates
 64.95.72.241 as permitted sender)
From: Mike Kimber <mkimber@kana.com>
To: "user@couchdb.apache.org" <user@couchdb.apache.org>
Date: Thu, 12 Apr 2012 13:32:30 -0400
Subject: RE: BigCouch - Replication failing with Cannot Allocate memory
Thread-Topic: BigCouch - Replication failing with Cannot Allocate memory
Thread-Index: Ac0YyTLlkEbsnRH0TeS2G45WSHV8xwAB+MMw
Message-ID: <A7D50E04F38FD44D9D914F2ABCA592BF2DE1B23003@BE259.mail.lan>
References: <A7D50E04F38FD44D9D914F2ABCA592BF2DE19FE78E@BE259.mail.lan>
	<A7D50E04F38FD44D9D914F2ABCA592BF2DE1B22FE6@BE259.mail.lan>
	<CABvT1DFLb0yh34LXRBVA01yVxRe_GyMCA7wYuwW8HmbXxgDygw@mail.gmail.com>
	<CABvT1DGaXYFY0u9N0G38FHy=VDm81xfMc2xeAkkmDCgQdeZWig@mail.gmail.com>
 <CABvT1DEQ6i76c+gZtR4xi0j5FKPW_W_xe46brUF+=Qz2eiQdyw@mail.gmail.com>
In-Reply-To: 
 <CABvT1DEQ6i76c+gZtR4xi0j5FKPW_W_xe46brUF+=Qz2eiQdyw@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

OK, I have 3 nodes, all load balanced with HAProxy:

CentOS 5.8 (virtualised)
2 cores
2GB RAM

I'm trying to replicate about 75K documents, which total 6GB when
compacted (on CouchDB 1.2, which has compression turned on). I'm told
they are fairly large documents.
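
For a rough sense of scale, the figures in this thread can be put side by side. This is only a back-of-the-envelope sketch: it assumes "GB" means GiB, that the 6GB is the compressed on-disk size (so in-memory sizes may well be larger), and it takes the failed allocation size from the eheap_alloc error quoted further down. The function names are made up for illustration.

```python
# Back-of-the-envelope sizes from the figures quoted in this thread.
DOC_COUNT = 75_000                   # "about 75K documents"
DB_BYTES = 6 * 1024**3               # "6GB when compacted" (assumed GiB)
NODE_RAM_BYTES = 2 * 1024**3         # "2GB RAM" per node (assumed GiB)
FAILED_ALLOC_BYTES = 1_459_620_480   # from the eheap_alloc error message


def average_doc_kib(db_bytes: int, doc_count: int) -> float:
    """Average on-disk document size in KiB."""
    return db_bytes / doc_count / 1024


def fraction_of_ram(alloc_bytes: int, ram_bytes: int) -> float:
    """Size of a single allocation as a fraction of a node's RAM."""
    return alloc_bytes / ram_bytes


if __name__ == "__main__":
    print(f"average document: {average_doc_kib(DB_BYTES, DOC_COUNT):.1f} KiB")
    print(f"failed allocation: "
          f"{fraction_of_ram(FAILED_ALLOC_BYTES, NODE_RAM_BYTES):.0%} of node RAM")
```

Under those assumptions the average document is somewhere around 84 KiB on disk, and the single heap allocation that failed is about two thirds of one node's RAM.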

When it goes pear-shaped, vmstat shows memory being used up:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  2 570576   8808    140   7208 2998 2249  3154  2249 1234  569  1  6  2 91  0
 0  2 569656   9156    156   7504 2330 1899  2405  1904 1246  595  1  5  9 85  0
 1  1 575412   9516    236  14928 1549 2261  3242  2261 1237  593  1  7  1 91  0
 0  2 607092  13220    168   8156 3772 9012  3871  9017 1284  714  1 10  4 85  0
 1  0 444336 857004    220  10212 5781    0  6202     0 1574 1010 13  7 33 47  0
 1  0 442176 870684    428  11052 2049    0  2208   140 2561 1541 17  8 49 26  0
 0  0 442176 813140    460  11968  170    0   348     0 2672 1565 25  9 61  4  0
 0  1 442176 744972    484  12224 5440    0  5493     7 2432  900  8  4 49 40  0
 0  1 442176 714048    484  12296 4547    0  4547     0 1799  827  4  2 50 44  0
 0  1 442176 686304    496  12688 5128    0  5222     0 1696  999  9  2 50 40  0
 0  3 444000   8712    444  12876  299  368   331   380 1294  188 22 20 36 23  0
 0  3 469340  10040    116   7336   29 5087    74  5090 1232  268  3 22  0 75  0
 1  2 584356  10220    124   6744 11367 28722 11370 28722 1643 1300  5 19 17 59  0
 0  1 624908  10640    132   7036 6518 12879  6590 12884 1296  717  3 10 29 58  0
 0  2 652556  10948    252  14776 3799 9494  5459  9494 1294  646  2  9 32 57  0
 0  2 677784  10648    244  14528 3819 8196  3819  8201 1274  588  2  7 30 61  0
 0  2 688460   9512    212   8224 3013 4522  3125  4522 1379  519  2  7  6 84  0
 0  3 699164   9888    208   8468 2192 4014  2228  4014 1302  495  1  6 11 83  0
 2  0 713104   9004    144   9192 2606 4490  2848  4490 1350  487  1  8 16 75  0
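
The vmstat sample above can be scanned mechanically for the intervals where the node is swapping hard. The following is a minimal sketch, not a diagnostic tool: the helper names and the two thresholds (`so_threshold`, `wa_threshold`) are arbitrary assumptions chosen for illustration, and only three of the rows above are embedded as sample input.

```python
from typing import Dict, List

# Three rows copied from the vmstat output above (column header included).
VMSTAT_SAMPLE = """\
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  2 570576   8808    140   7208 2998 2249  3154  2249 1234  569  1  6  2 91  0
 1  0 444336 857004    220  10212 5781    0  6202     0 1574 1010 13  7 33 47  0
 1  2 584356  10220    124   6744 11367 28722 11370 28722 1643 1300  5 19 17 59  0
"""


def parse_vmstat(text: str) -> List[Dict[str, int]]:
    """Turn vmstat text into one dict per sample row, keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Drop the "procs ---memory--- ..." banner line if it is present.
    lines = [ln for ln in lines if not ln.lstrip().startswith("procs")]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        values = ln.split()
        if len(values) != len(header):
            continue  # skip anything that is not a complete sample row
        rows.append(dict(zip(header, map(int, values))))
    return rows


def is_thrashing(row: Dict[str, int],
                 so_threshold: int = 1000,
                 wa_threshold: int = 50) -> bool:
    """Hypothetical rule of thumb: heavy swap-out plus high I/O wait."""
    return row["so"] >= so_threshold and row["wa"] >= wa_threshold


if __name__ == "__main__":
    for row in parse_vmstat(VMSTAT_SAMPLE):
        state = "thrashing" if is_thrashing(row) else "ok"
        print(f"swpd={row['swpd']} free={row['free']} "
              f"so={row['so']} wa={row['wa']} -> {state}")
```

With these assumed thresholds, the rows where free memory is down to around 10 MB are flagged, while the row with roughly 857 MB free and no swap-out is not.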

It only ever takes out one node at a time, and the other nodes seem to
be doing very little while that one node is running out of memory.

If I kick it off again, it processes some more, then memory spikes and
it fails.

Thanks

Mike

PS: hope you enjoyed your CouchDB get-together!

-----Original Message-----
From: Robert Newson [mailto:rnewson@apache.org]
Sent: 12 April 2012 17:28
To: user@couchdb.apache.org
Subject: Re: BigCouch - Replication failing with Cannot Allocate memory

What kind of load were you putting on the machine?

On 12 April 2012 17:24, Robert Newson <rnewson@apache.org> wrote:
> Could you show your vm.args file?
>
> On 12 April 2012 17:23, Robert Newson <rnewson@apache.org> wrote:
>> Unfortunately your request for help coincided with the two day CouchDB
>> Summit. #cloudant and the Issues tab on cloudant/bigcouch are other
>> ways to get bigcouch support, but we happily answer queries here too,
>> when not at the Model UN of CouchDB. :D
>>
>> B.
>>
>> On 12 April 2012 17:10, Mike Kimber <mkimber@kana.com> wrote:
>>> Looks like this isn't the right place, based on the responses so
>>> far. Shame, I hoped this was going to help solve our index/view
>>> rebuild times etc.
>>>
>>> Mike
>>>
>>> -----Original Message-----
>>> From: Mike Kimber [mailto:mkimber@kana.com]
>>> Sent: 10 April 2012 09:20
>>> To: user@couchdb.apache.org
>>> Subject: BigCouch - Replication failing with Cannot Allocate memory
>>>
>>> I'm not sure if this is the correct place to raise an issue I am
>>> having with replicating a standalone CouchDB 1.1.1 to a 3-node
>>> BigCouch cluster. If this is not the correct place, please point me
>>> in the right direction. If it is, does anyone have any ideas why I
>>> keep getting the following error message when I kick off a
>>> replication?
>>>
>>> eheap_alloc: Cannot allocate 1459620480 bytes of memory (of type "heap").
>>>
>>> My set-up is:
>>>
>>> Standalone CouchDB 1.1.1 running on CentOS 5.7
>>>
>>> 3-node BigCouch cluster running on CentOS 5.8, pulling from the
>>> standalone CouchDB (78K documents), with the following local.ini
>>> overrides:
>>>
>>> [httpd]
>>> bind_address = XXX.XX.X.XX
>>>
>>> [cluster]
>>> ; number of shards for a new database
>>> q = 9
>>> ; number of copies of each shard
>>> n = 1
>>>
>>> [couchdb]
>>> database_dir = /other/bigcouch/database
>>> view_index_dir = /other/bigcouch/view
>>>
>>> The error is always generated on the third node in the cluster, and
>>> the server basically maxes out on memory beforehand. The other nodes
>>> seem to be doing very little, but are getting data, i.e. the shard
>>> sizes are growing. I've put the copies per shard down to 1 as
>>> currently I'm not interested in resilience.
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Mike
>>>
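
As a small worked example of the [cluster] settings quoted above: with q = 9 shards and n = 1 copy of each, there are 9 shard files in total to spread over the 3 nodes. The sketch below assumes the shards are placed as evenly as possible, which is an assumption made for illustration and not a description of how BigCouch actually assigns shards to nodes. The function name is made up.

```python
from typing import List


def shard_copies_per_node(q: int, n: int, nodes: int) -> List[int]:
    """Shard copies held by each node, assuming an even-as-possible spread.

    q     -- number of shards per database
    n     -- number of copies of each shard
    nodes -- number of nodes in the cluster
    """
    total = q * n
    base, extra = divmod(total, nodes)
    # The first `extra` nodes take one additional copy when it does not divide evenly.
    return [base + (1 if i < extra else 0) for i in range(nodes)]


if __name__ == "__main__":
    # Values from the quoted local.ini: q = 9, n = 1, on a 3-node cluster.
    print(shard_copies_per_node(q=9, n=1, nodes=3))
```

Under that even-placement assumption each node would hold 3 shard files, so the data alone would not obviously explain why only one node runs out of memory.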