Subject: Re: Read request throughput
From: Huw Selley
To: user@couchdb.apache.org
Date: Thu, 2 Dec 2010 11:29:38 +0000

Thanks for the response Adam :)

Some updates below:

On 1 Dec 2010, at 13:30, Adam Kocoloski wrote:

> So the Erlang VM starts 16 schedulers by default, right? Some people have
> reported improvements in Erlang application performance with HyperThreading
> disabled, but I've not heard of any CouchDB-specific tests of that option
> yet.

Yeah, that's right - 16:16 by default.

>> The database is pretty small (just under 100K docs) and I am querying a
>> view that includes some other docs (the request contains
>> include_docs=true) and using jmeter on another identical box to generate
>> the traffic.
>
> include_docs=true is definitely more work at read time than embedding the
> docs in the view index. I'm not sure about your application design
> constraints, but given that your database and index seem to fit entirely
> in RAM at the moment you could experiment with emitting the doc in your
> map function instead ...
>
>> The total amount of data returned from the request is 1467 bytes.
>
> ... especially when the documents are this small.

Sure, but I would have expected that to only really help if the system was
contending for resources? I am using linked docs, so I'm not sure about
emitting the entire doc in the view.
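For anyone else following along, the change Adam is suggesting looks roughly
like this (just a sketch - the "name" field is made up, and it only applies
where the doc being emitted is the one being indexed, not a linked one):

    // Embed the doc in the view row at index time; reads then skip
    // the per-row doc fetch that ?include_docs=true performs.
    function (doc) {
      emit(doc.name, doc);
    }

    // versus emitting a placeholder and fetching each doc at read
    // time with ?include_docs=true
    function (doc) {
      emit(doc.name, null);
    }

The first trades a larger view index (each doc is copied into it) for
cheaper reads, which is presumably why Adam notes it suits small docs.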
>> Before coming here to question my findings I took a 3rd box (same spec)
>> and built couch from the tip of the 1.1.x branch (rev 1040477). After
>> compiling couch and installing it I found that it didn't start up (or log
>> anything useful). After a bit of digging I figured it was probably due to
>> the age of the erlang version being used - I upgraded to OTP R14B and
>> rebuilt couch against it. This gave me a working install again.
>
> Hmm, I've heard that we did something to break compatibility with R12B-5
> recently. We should either fix it or bump the required version. Thanks for
> the note.

COUCHDB-856?

>> I got an immediate throughput increase to ~500 requests/s, which was
>> nice, but the data being collected via sadc still showed that the cpu was
>> at most 20% utilised and the disk controller was doing next to nothing (I
>> assume the OS cache already holds the requested data, so no trip to disk
>> is required?)
>>
>> At this point I started to wonder if jmeter is unable to send in enough
>> requests to stress couch, so I started up another jmeter instance on
>> another box and had it also send in requests to couch. What I noticed was
>> that the total throughput didn't increase - it was just split over both
>> jmeter instances.
>
> How many concurrent requests are submitted by each jmeter instance?

25.

> Do you know if the CPU load was spread across cores or concentrated on a
> single one? One thing Kenneth did not mention in that thread is that you
> can now bind Erlang schedulers to specific cores. By default the
> schedulers are unbound; maybe RHEL is doing a poor job of distributing
> them. You can bind them using the default strategy for your CPUs by
> starting the VM with the "+sbt db" option.

It was using most of 2 cores. I had a go with "+sbt db" and it didn't
perform as well as "-S 16:2".

WRT disabling HT - I need to take a trip to the datacentre to disable HT in
the bios, but I tried disabling some cores with:

    echo 0 > /sys/devices/system/node/nodeX/cpuX/online

which should stop the kernel seeing the core - not as clean as disabling it
in the bios, but it should suffice. /proc/cpuinfo stopped showing the cores
I removed, so it looks like it worked. Again I didn't see any improvement.

Cheers

Huw
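P.S. For anyone else who wants to try "+sbt db": the Erlang VM also picks
up extra emulator flags from the ERL_FLAGS environment variable, so
scheduler binding can be tested without editing any scripts - a sketch,
assuming the stock couchdb wrapper script ultimately execs erl (which reads
ERL_FLAGS from the environment):

    # bind schedulers to cores using the default binding strategy
    ERL_FLAGS="+sbt db" couchdb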