couchdb-user mailing list archives

From John Cheng <johnlich...@gmail.com>
Subject Re: CouchDB performance
Date Wed, 17 Aug 2011 00:08:23 GMT
> Just to follow up on this. I've pushed a test case written in Scala to
> Github (https://github.com/jlcheng/couchdb-test). With this test case, I am
> using Apache HttpClient to perform testing. Since I am more familiar with
> Java, I was able to optimize the test case more. With Scala and CouchDB
> 0.10.0 (default install from Ubuntu 10.04 LTS), I am seeing 94 inserts per
> second with a document size of 100 KB (using batch=ok). I think this is
> pretty close to being optimized.
> However, going to a custom-compiled CouchDB 1.1.0, I am seeing a mere 20
> inserts per second. I also see disk activity drop from ~7 MB/sec of writes
> to ~2 MB/sec. There must be something wrong with the way I have it
> compiled or configured. Any ideas where I should start?
>
By luck, I noticed that Apache provides an async HttpClient library.
By switching to the async library, where the code does not wait on
shared HTTP connections, I was able to get more reasonable performance
out of CouchDB 1.1.0. Now I am seeing 200 inserts/sec at a 100 KB
document size, and my CPU and disks are both heavily utilized.

This makes me wonder what changed between 0.10.0 and 1.1.0 that
requires such a change in how clients should access CouchDB. Why did
the traditional HttpClient code get only 20 inserts per second? Where
was the bottleneck? It also suggests that efficient client code for
CouchDB access may not be easy to implement (I'm not familiar with the
equivalent of the java.nio API in Python, for example). It makes me
hesitant to recommend using Python with CouchDB right now.
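
For comparison, here is a minimal sketch of the same idea in Python: keep
several inserts in flight so that no request waits on a shared connection.
It is not a java.nio equivalent. It uses a plain thread pool from the
standard library, with each request opening its own connection. The server
URL, the database name (perf_test), and the helper names build_request,
send, and insert_all are all assumptions made for illustration and do not
come from the test case on Github.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumptions for illustration only: server URL and database name.
COUCH_URL = "http://localhost:5984"
DB_NAME = "perf_test"


def build_request(doc, base_url=COUCH_URL, db=DB_NAME):
    """Build a POST that creates one document using batch=ok."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{db}?batch=ok",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send one request on its own connection and return the HTTP status."""
    with urllib.request.urlopen(request, timeout=30) as response:
        response.read()  # drain the body before the connection is closed
        return response.status


def insert_all(docs, workers=8, sender=send):
    """Insert docs with up to `workers` requests in flight at once.

    Results come back in the same order as `docs`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda d: sender(build_request(d)), docs))
```

Against a running server, something like
insert_all([{"n": i, "payload": "x" * 100000} for i in range(1000)])
would return one HTTP status per document. The worker count of 8 is an
arbitrary starting point, not a measured optimum, and I have not
benchmarked this against the numbers above.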
