couchdb-dev mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Please report your indexing speed
Date Sun, 04 Mar 2012 12:10:41 GMT
Excellent stuff, thanks Jan. I think it would be prudent to attempt to
identify which patch, or, more likely, which part of the patch, caused
the 30% regression and why. I will attempt that tomorrow or possibly
later today.

B.

On 4 March 2012 12:03, Bob Dionne <dionne@dionne-associates.com> wrote:
> Great Jan, so this confirms my back of the envelope test using Bob's script and Filipe's results. The patch is definitely helpful.
>
> I was wondering why no one had looked at test/bench; perhaps this more rigorous approach could provide the basis for a comprehensive performance tool.
>
> On Mar 4, 2012, at 4:24 AM, Jan Lehnardt wrote:
>
>> Hey all,
>>
>> I made another run with a bit of a different scenario.
>>
>>
>> # The Scenario
>>
>> I used a modified benchbulk.sh for inserting data (because it is an order of magnitude faster than the other methods we had). I added a command line parameter to specify the size of a single document in bytes (this was previously hardcoded in the script). Note that this script creates docs in a btree-friendly incrementing ID way.
>>
>> I added a new script benchview.sh which is basically the lower part of Robert Newson's script. It creates a single view and queries it, measuring execution time of curl.
>>
>> And a third matrix.sh (yay) that would run, on my system, different configurations.
>>
>> See https://gist.github.com/1971611 for the scripts.
>>
>> I ran ./benchbulk $size && ./benchview.sh for the following combinations, all on Mac OS X 10.7.3, Erlang R15B, Spidermonkey 1.8.5:
>>
>> - Doc sizes 10, 100, 1000 bytes
>> - CouchDB 1.1.1, 1.2.x (as of last night), 1.2.x-filipe (as of last night + Filipe's patch from earlier in the thread)
>> - On an SSD and on a 5400rpm internal drive.
>>
>> I ran each individual test three times and took the average to compare numbers. The full report (see below) includes each individual run's numbers.
>>
>> (The gist includes the raw output data from matrix.sh for the 5400rpm run; for the SSDs, I don't have the original numbers anymore. I'm happy to re-run this, if you want that data as well.)
>>
>> # The Numbers
>>
>> See https://docs.google.com/spreadsheet/ccc?key=0AhESVUYnc_sQdDJ1Ry1KMTQ5enBDY0s1dHk2UVEzMHc for the full data set. It'd be great to get a second pair of eyes to make sure I didn't make any mistakes.
>>
>> See the "Grouped Data" sheet for comparisons.
>>
>> tl;dr: 1.2.x is about 30% slower and 1.2.x-filipe is about 30% faster than 1.1.1 in the scenario above.
>>
>>
>> # Conclusion
>>
>> +1 to include Filipe's patch into 1.2.x.
>>
>>
>>
>> I'd love any feedback on methods, calculations and whatnot :)
>>
>> Also, I can run more variations if you like, e.g. other Erlang or SpiderMonkey versions; just let me know.
>>
>>
>> Cheers
>> Jan
>> --
>>
>> On Feb 28, 2012, at 14:17 , Jason Smith wrote:
>>
>>> Forgive the clean new thread. Hopefully it will not remain so.
>>>
>>> If you can, would you please clone https://github.com/jhs/slow_couchdb
>>>
>>> And build whatever Erlangs and CouchDB checkouts you see fit, and run
>>> the test. For example:
>>>
>>>   docs=500000 ./bench.sh small_doc.tpl
>>>
>>> That should run the test and, God willing, upload the results to a
>>> couch in the cloud. We should be able to use that information to
>>> identify who you are, whether you are on SSD, what Erlang and Couch
>>> build, and how fast it ran. Modulo bugs.
>>
>
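
The comparison quoted above (three runs per configuration, averaged, then compared against 1.1.1) comes down to a small piece of arithmetic. Below is a minimal sketch of it, assuming the measured quantity is wall-clock time in seconds, so that a positive result means "slower". The function name `relative_change` and the timings are hypothetical placeholders for illustration only; they are not taken from the spreadsheet or the gist.

```python
from statistics import mean


def relative_change(baseline_runs, candidate_runs):
    """Percent change of the candidate's mean run time versus the baseline's.

    Positive means the candidate took longer (slower); negative means faster.
    """
    base = mean(baseline_runs)
    cand = mean(candidate_runs)
    return (cand - base) / base * 100.0


# Placeholder timings in seconds: NOT measured data, just round numbers
# chosen to make the arithmetic easy to follow.
runs = {
    "baseline": [10.0, 10.0, 10.0],
    "candidate-a": [13.0, 13.0, 13.0],
    "candidate-b": [7.0, 7.0, 7.0],
}

for name in ("candidate-a", "candidate-b"):
    change = relative_change(runs["baseline"], runs[name])
    print(f"{name}: {change:+.1f}% vs baseline")
```

With only three runs per configuration, it may also be worth looking at the spread of the individual runs (for example with `statistics.stdev`) before reading much into a difference in the means.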
