incubator-couchdb-user mailing list archives

From Dirk-Willem van Gulik <Dirk-Willem.van.Gu...@bbc.co.uk>
Subject User, Erlang, Couchdb or kernel error ?
Date Fri, 30 Jan 2009 15:10:52 GMT
Folks,

When blasting a CouchDB install (0.9.0a, r738928) with lots of requests
(see script [1]; basically 2-8 writers and 8-32 readers), I see the
following behaviour, regardless of the R:W ratio:

	Running version 0.9.0a-incubating on db test_6204
	#what num       count   ok	ops/sec
	reader	1	1000	0%	4000 ops/sec
	reader	2	1000	0%	4000 ops/sec
	reader	3	1000	0%	3200 ops/sec
	reader	4	1000	0%	2667 ops/sec
	writer	1	1000	100%	 640 ops/sec
	reader	10	1000	0%	1231 ops/sec
	reader	4	2000	0%	1600 ops/sec
	... lots more...[4]

	Connection error: 500 Can't connect to localhost:5984 (connect:
		 Cannot assign requested address) at
		/usr/lib/perl5/site_perl/5.8.8/CouchDB/
		Client/Doc.pm line 85

At this point every client gets a 'Cannot assign requested address'.

The server is then unreachable for some 20-30 seconds [2], after which
it recovers by itself.

An 'lsof' shows that the socket is still in LISTEN.

Nothing shows up in the CouchDB log (at debug, info or error log
level) [3].
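
The 'Cannot assign requested address' text in the error above is the
operating system's message for errno EADDRNOTAVAIL. As a minimal sketch
(in Python rather than the Perl of [1]; the function names are
hypothetical and not part of any CouchDB client), a test client could
record the errno name of each failed connect so that runs on Linux and
MacOSX can be compared directly:

```python
import errno
import socket


def describe_errno(code):
    """Map a numeric errno to its symbolic name, e.g. 'EADDRNOTAVAIL'."""
    if code is None:
        # Some failures (such as a timeout) carry no errno at all.
        return "unknown"
    return errno.errorcode.get(code, "errno %d" % code)


def try_connect(host="localhost", port=5984, timeout=2.0):
    """Attempt one TCP connect.

    Returns None on success, or the symbolic errno name on failure.
    Host and port default to the CouchDB address used in this report.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:
        return describe_errno(exc.errno)
```

Calling try_connect() in a loop during the outage and tallying the
returned names would show whether every failure is the same errno.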

The issue happens on MacOSX (9.6.0) and Linux/CentOS (2.6.18-92.1.17.el5),
and I needed a dual-core (or better) machine to have the requests
hammer fast enough to cause this. On a laptop (or when copious debug
or 'info' level logging output slows the IO down to < 800 ops/second)
one never hits this stage. It is easier to trigger with SAS disks than
with SATA disks.

CPU load stays < 20% during the test; disk IO is totally maxed out when
either 1) the dataset exceeds the usual buffers or 2) you do any sync.
Note that this is a single instance on a single spindle shared with the
OS in each case. Traffic is up to a few Gbits.

Nothing in /var/log/messages or dmesg.

Any hints as to whether this is a user error (me being stupid), a CouchDB
error, or whether I need to start diving into the kernel or Erlang [5]?

Note that the behaviour on Linux and MacOS-X is identical. Note that 
various versions of /trunk seem to exhibit this.

Any advice ? Or shall I file a bug ?

Thanks,

Dw.

1: http://people.apache.org/~dirkx/p.pl

2: With the command:

	perl ~/p.pl ; /usr/sbin/lsof | grep couchdb | grep TCP
	while ! curl http://localhost:5984/; do date; sleep 1; done
    one gets the output:
	.. all childs exiting..
	beam.smp   5534   couchdb   ...
		TCP localhost.localdomain:5984 (LISTEN)
	curl: (7) Failed to connect to 127.0.0.1:
		Cannot assign requested address
	Fri Jan 30 14:34:13 GMT 2009
	...
	Fri Jan 30 14:34:38 GMT 2009
	$
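
    The curl loop above only prints timestamps; the length of the outage
    has to be read off by hand. A rough sketch of the same probe in
    Python that reports the duration directly (the names http_probe and
    measure_outage are made up for illustration, and the one second
    interval simply mirrors the 'sleep 1' above):

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if the server answers at all, False if the connect fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so it is reachable.
        return True
    except OSError:
        # Covers URLError, timeouts and refused or unassignable connects.
        return False


def measure_outage(probe, interval=1.0, max_wait=120.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it succeeds.

    Returns (seconds_until_success, failed_attempts), or
    (None, failed_attempts) if max_wait seconds pass without success.
    """
    start = clock()
    failures = 0
    while not probe():
        failures += 1
        if clock() - start >= max_wait:
            return None, failures
        sleep(interval)
    return clock() - start, failures


# Example use, run right after the load script exits:
#   elapsed, failures = measure_outage(
#       lambda: http_probe("http://localhost:5984/"))
#   print("unreachable for %s s, %d failed attempts" % (elapsed, failures))
```

    The clock and sleep arguments exist only so the loop can be
    exercised without a real server or real waiting.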

3: tail end:
	[info] [<0.3294.1>] 127.0.0.1 - - 'GET' /test_7986/06455 404
	[info] [<0.3255.1>] 127.0.0.1 - - 'PUT' /test_7986/2797 201
	[info] [<0.3270.1>] 127.0.0.1 - - 'PUT' /test_7986/31176 201
	[info] [<0.3285.1>] 127.0.0.1 - - 'PUT' /test_7986/1870 201
	[info] [<0.3298.1>] 127.0.0.1 - - 'PUT' /test_7986/0989 201	
	.. server very silent...
	[debug] [<0.3304.1>] 'GET' / {1,1}
	(the first curl GET from [2] above getting through)

4: Ignore the 'ok' field - that is ok.

5: Linux
	Erlang (BEAM) emulator version 5.6.3 [source] [64-bit]
		[smp:8] [async-threads:0] [hipe] [kernel-poll:false]
    MacOSX
	Erlang (BEAM) emulator version 5.6.3 [source]
		[async-threads:0] [kernel-poll:false]


