Subject: Re: Read request throughput
From: Adam Kocoloski <adam.kocoloski@gmail.com>
Date: Thu, 2 Dec 2010 09:41:55 -0500
To: user@couchdb.apache.org

On Dec 2, 2010, at 6:29 AM, Huw Selley wrote:

>> include_docs=true is definitely more work at read time than embedding the docs in the view index. I'm not sure about your application design constraints, but given that your database and index seem to fit entirely in RAM at the moment, you could experiment with emitting the doc in your map function instead ...
>>
>>> The total amount of data returned from the request is 1467 bytes.
>>
>> ... especially when the documents are this small.
>
> Sure, but I would have expected that to only really help if the system was contending for resources? I am using linked docs, so I'm not sure about emitting the entire doc in the view.

Didn't realize you were using linked docs. You're certainly right, there's no way to emit those directly.

>> Hmm, I've heard that we did something to break compatibility with R12B-5 recently. We should either fix it or bump the required version. Thanks for the note.
>
> COUCHDB-856?

Ah, right. That one was my fault. But Filipe fixed it in r1034380, so it shouldn't have caused you any trouble here.

>> Do you know if the CPU load was spread across cores or concentrated on a single one? One thing Kenneth did not mention in that thread is that you can now bind Erlang schedulers to specific cores. By default the schedulers are unbound; maybe RHEL is doing a poor job of distributing them. You can bind them using the default strategy for your CPUs by starting the VM with the "+sbt db" option.
>
> It was using most of 2 cores. I had a go with "+sbt db" and it didn't perform as well as "-S 16:2".
>
> WRT disabling HT - I need to take a trip to the datacentre to disable HT in the BIOS, but I tried disabling some cores with:
>
>   echo 0 > /sys/devices/system/node/nodeX/cpuX/online
>
> which should stop the kernel from seeing the core - not as clean as disabling it in the BIOS, but it should suffice. /proc/cpuinfo stopped showing the cores I removed, so it looks like it worked.
> Again, I didn't see any improvement.
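For reference, both of those settings are standard erl(1) emulator flags. A minimal sketch of passing them to the VM, assuming a launcher that ultimately execs erl - ERL_FLAGS is read by the runtime itself, so it should apply regardless of the wrapper, and "couchdb" below stands in for however the server is actually started:

  # Bind schedulers to cores using the default binding type:
  ERL_FLAGS="+sbt db" couchdb

  # Start 16 schedulers with only 2 online; the "-S 16:2" quoted
  # above presumably maps to erl's +S Schedulers:SchedulersOnline:
  ERL_FLAGS="+S 16:2" couchdb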
Ok, interesting. When you request an up-to-date view there are basically seven Erlang processes involved: one HTTP connection handler, two couch_file servers (one for the .couch file and one for the .view file), a couch_db server, a couch_view_group server, and then two registered processes (couch_server and couch_view). When you send additional concurrent requests for the same view, CouchDB spawns off additional HTTP handlers to do things like JSON encoding and header processing, but the other six processes just need to handle the additional load themselves.

The fact that you only saw two cores regularly used suggests that one of these processes turned into a bottleneck (and when they weren't blocked, the other processes ran on the second core). My guess would be the DB couch_file, since every view request was hitting it multiple times: once to open the ddoc and N times to load the linked documents (see the sketch at the end of this message). But that's just a guess. I'm mildly surprised that you see a significant gain from dropping down to 2 active schedulers, and it's not a mode of operation I would recommend if you plan to have multiple active databases. But I can see where it might help this particular benchmark a bit.

This is the first time I've seen someone try to maximize the throughput for this particular type of request, so I don't have any more bright suggestions. If I'm right about the cause of the bottleneck I can think of new optimizations we might add to reduce it in the future, but nothing in terms of tweaks to the server config.

Regards,
Adam

> Cheers
> Huw
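For reference, a minimal sketch of the linked-documents pattern behind that per-row document load (the database, design doc, and field names are hypothetical; the _id-resolving behavior of include_docs=true on emitted values is standard CouchDB):

  # Each row's value names another document by _id, so
  # include_docs=true loads *that* document at read time --
  # one extra read through the DB couch_file per row:
  curl -X PUT http://127.0.0.1:5984/mydb/_design/app \
       -H 'Content-Type: application/json' \
       -d '{"views": {"linked": {"map":
            "function(doc) { if (doc.ref) emit(doc._id, {\"_id\": doc.ref}); }"}}}'

  # Rows come back with the linked document, not the emitting one:
  curl 'http://127.0.0.1:5984/mydb/_design/app/_view/linked?include_docs=true'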