Subject: Re: Read request throughput
From: Huw Selley
To: user@couchdb.apache.org
Date: Thu, 2 Dec 2010 11:29:38 +0000

Thanks for the response Adam :)

Some updates below:

On 1 Dec 2010, at 13:30, Adam Kocoloski wrote:

> So the Erlang VM starts 16 schedulers by default, right? Some people have
> reported improvements in Erlang application performance with HyperThreading
> disabled, but I've not heard of any CouchDB-specific tests of that option
> yet.

Yeah, that's right - 16:16 by default.

>> The database is pretty small (just under 100K docs) and I am querying a
>> view that includes some other docs (the request contains
>> include_docs=true) and using jmeter on another identical box to generate
>> the traffic.
>
> include_docs=true is definitely more work at read time than embedding the
> docs in the view index. I'm not sure about your application design
> constraints, but given that your database and index seem to fit entirely
> in RAM at the moment you could experiment with emitting the doc in your
> map function instead ...
>
>> The total amount of data returned from the request is 1467 bytes.
>
> ... especially when the documents are this small.

Sure, but I would have expected that to only really help if the system was
contending for resources? I am using linked docs, so I'm not sure about
emitting the entire doc in the view.
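For anyone else following along, the change Adam is suggesting looks roughly
like this (just a sketch - the "name" field is made up, and it only applies
where the doc being emitted is the one being indexed, not a linked one):

    // Embed the doc in the view row at index time; reads then skip
    // the per-row doc fetch that ?include_docs=true performs.
    function (doc) {
      emit(doc.name, doc);
    }

    // versus emitting a placeholder and fetching each doc at read
    // time with ?include_docs=true
    function (doc) {
      emit(doc.name, null);
    }

The first trades a larger view index (each doc is copied into it) for
cheaper reads, which is presumably why Adam notes it suits small docs.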
>> Before coming here to question my findings I took a 3rd box (same spec)
>> and built couch from the tip of the 1.1.x branch (rev 1040477). After
>> compiling couch and installing it I found that it didn't start up (or log
>> anything useful). After a bit of digging I figured it was probably due to
>> the age of the erlang version being used - I upgraded to OTP R14B and
>> rebuilt couch against it. This gave me a working install again.
>
> Hmm, I've heard that we did something to break compatibility with R12B-5
> recently. We should either fix it or bump the required version. Thanks for
> the note.

COUCHDB-856?

>> I got an immediate throughput increase to ~500 requests/s, which was
>> nice, but the data being collected via sadc still showed that the cpu was
>> at most 20% utilised and the disk controller was doing next to nothing (I
>> assume the OS cache already holds the requested data, so no trip to disk
>> is required?)
>>
>> At this point I started to wonder if jmeter is unable to send in enough
>> requests to stress couch, so I started up another jmeter instance on
>> another box and had it also send in requests to couch. What I noticed was
>> that the total throughput didn't increase - it was just split over both
>> jmeter instances.
>
> How many concurrent requests are submitted by each jmeter instance?

25.

> Do you know if the CPU load was spread across cores or concentrated on a
> single one? One thing Kenneth did not mention in that thread is that you
> can now bind Erlang schedulers to specific cores. By default the
> schedulers are unbound; maybe RHEL is doing a poor job of distributing
> them. You can bind them using the default strategy for your CPUs by
> starting the VM with the "+sbt db" option.

It was using most of 2 cores. I had a go with "+sbt db" and it didn't
perform as well as "-S 16:2".

WRT disabling HT - I need to take a trip to the datacentre to disable HT in
the bios, but I tried disabling some cores with:

    echo 0 > /sys/devices/system/node/nodeX/cpuX/online

which should stop the kernel seeing the core - not as clean as disabling it
in the bios, but it should suffice. /proc/cpuinfo stopped showing the cores
I removed, so it looks like it worked. Again I didn't see any improvement.

Cheers

Huw
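P.S. For anyone else who wants to try "+sbt db": the Erlang VM also picks
up extra emulator flags from the ERL_FLAGS environment variable, so
scheduler binding can be tested without editing any scripts - a sketch,
assuming the stock couchdb wrapper script ultimately execs erl (which reads
ERL_FLAGS from the environment):

    # bind schedulers to cores using the default binding strategy
    ERL_FLAGS="+sbt db" couchdb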