From: Bob Dionne
Subject: Re: Please report your indexing speed
Date: Mon, 5 Mar 2012 06:23:00 -0500
To: dev@couchdb.apache.org
Message-Id: <6D224249-142D-463C-9368-665CACDE509B@dionne-associates.com>

Awesome Filipe, so these two were related, I didn't get that subtlety in your original post. This is great, thanks for the patch

-- Bob

On Mar 5, 2012, at 2:41 AM, Filipe David Manana wrote:

> On Sun, Mar 4, 2012 at 9:45 AM, Bob Dionne wrote:
>> yes, I was surprised by the 30% claim as my numbers showed it only getting back to where we were with 1.1.x
>>
>> I think Bob's suggestion to get to the root code change that caused this regression is important as it will help us assess all the other cases this testing hasn't even touched yet
>
> The explanation I gave in the 1.2.0 second round vote identifies the
> reason, which is that the updater is (depending on timings) collecting
> smaller batches of map results, which makes the btree updates less
> efficient (besides higher number of btree updates). The patch
> addresses this by queuing a batch of map results instead of queuing
> map results one by one. Jan's tests and mine are evidence that this is
> valid in practice and not just theory.
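
To make the batching point above concrete, here is a toy sketch in Python. It is not CouchDB code (the real updater is Erlang, in couch_view_updater); `ToyIndex`, `batched` and `build_index` are hypothetical names, and the "index" only counts how many bulk updates it receives. The assumption is simply that each bulk update has a fixed cost, so fewer, larger updates are cheaper overall.

```python
from typing import Iterable, Iterator, List


class ToyIndex:
    """Stand-in for a btree: records how many bulk updates it received."""

    def __init__(self) -> None:
        self.updates = 0
        self.rows = 0

    def add_many(self, rows: List[dict]) -> None:
        # One call models one btree update, whatever the batch size.
        self.updates += 1
        self.rows += len(rows)


def batched(results: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group map results into lists of at most batch_size items."""
    batch: List[dict] = []
    for result in results:
        batch.append(result)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


def build_index(results: Iterable[dict], batch_size: int) -> ToyIndex:
    """Feed map results to the toy index one batch at a time."""
    index = ToyIndex()
    for batch in batched(results, batch_size):
        index.add_many(batch)
    return index


# Hypothetical map results, one per document.
map_results = [{"key": ["dwarf", "assassin"], "doc": n} for n in range(1000)]

one_by_one = build_index(map_results, batch_size=1)
in_batches = build_index(map_results, batch_size=100)
print(one_by_one.updates, in_batches.updates)
```

Both runs index the same 1000 rows; only the number of updates differs, which is the effect the patch is described as targeting.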
>
> The original main goal of COUCHDB-1186 was to make the indexing of
> views that emit reasonably large (or complex in structure) map values
> more efficient.
> Here's an example using Jason's slow_couchdb script with wow.tpl and
> map function of "function(doc) {emit([doc.type, doc.category], doc);}":
>
> 1.1.x:
>
> fdmanana 07:04:12 ~/git/hub/slow_couchdb (master)> docs=200000 batch=5000 ./bench.sh wow.tpl
> Server: CouchDB/1.1.2a785d32f-git (Erlang OTP/R14B03)
> {"couchdb":"Welcome","version":"1.1.2a785d32f-git"}
>
> [INFO] Created DB named `db1'
> [INFO] Uploaded 5000 documents via _bulk_docs
> (....)
> [INFO] Uploaded 5000 documents via _bulk_docs
> Building view.
> {"total_rows":200000,"offset":0,"rows":[
> {"id":"00144af5-9f07-448e-af88-026674e3e3d0","key":["dwarf","assassin"],"value":{"_id":"00144af5-9f07-448e-af88-026674e3e3d0","_rev":"1-785fbf5e641f3d10fa083501ad82a9fe","data3":"Vl6BftQEWY6imvNs0FasOj2CrPCptP70z5d","ratio":1.6,"integers":[48028,3170,54066,95547,70643,23763,25804,33180,89061,35274,48244,91792,37936,11855],"category":"assassin","nested":{"dict":{"3XGVdTTF":31490,"SDxKa54e":40,"XIzUloRH":7,"5Mj9F4bp":192,"1sXfjgYf":1203,"XP5YSqhX":25461,"QJr0Xhxn":9941},"string1":"3Q4tvmhHwKvedKiRnoL6xUz","string2":"dWI1mrrAypRh","values":[33712,57371,88567,88361,70873,6327,17326,91004,41840,86257],"string3":"i7OGysnXvynz41VMQJ","coords":[{"x":65350.46,"y":103881.18},{"x":24180.14,"y":8474.9},{"x":88326.66,"y":43151.76},{"x":120199.77,"y":102270.29},{"x":191924.18,"y":74479.75}]},"level":21,"type":"dwarf","data1":"Vpkplo80LshlcjBE0ySJNNpfgDy2bu8byWrmZ44B","data2":"GnyNbos75Wxm1C5MLdOeXNniHamBjld70vHqoJnEtnlfekuPXJ"}}
> ]}
>
> real 2m49.227s
> user 0m0.006s
> sys 0m0.005s
>
>
> 1.2.x:
>
> fdmanana 07:13:30 ~/git/hub/slow_couchdb (master)> docs=200000 batch=5000 ./bench.sh wow.tpl
> Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
> {"couchdb":"Welcome","version":"1.2.0"}
>
> [INFO] Created DB named `db1'
> [INFO] Uploaded 5000 documents via _bulk_docs
> (....)
> [INFO] Uploaded 5000 documents via _bulk_docs
> Building view.
> {"total_rows":200000,"offset":0,"rows":[
> {"id":"0005cd07-49f4-4a99-b506-acef948f2acc","key":["dwarf","assassin"],"value":{"_id":"0005cd07-49f4-4a99-b506-acef948f2acc","_rev":"1-4b418e69618bf11124a03e1a3845f071","data3":"T0W2JBUET9yzRXHfUqcUBwFhYGKh14YFVxk","ratio":1.6999999999999999556,"integers":[25658,7573,47779,43217,49586,57992,13549,90984,45253,49560,1643,64085,38381,62544],"category":"assassin","nested":{"dict":{"oWhW4jJ6":199,"EPSVtKtS":5638,"8WpzvD5x":73714,"stD9Ynfh":8924,"0qh5Nc1g":5994,"pBa5vJyy":18,"s25oAkRc":165270},"string1":"fNNHb8lxtcy7GpwSU3yRyaC","string2":"rilbiZM7yAaK","values":[49632,93665,73258,75675,4229,15742,16965,76825,22049,79829],"string3":"IwX09SiOLMSSyxffMB","coords":[{"x":179620.45000000001164,"y":11539.989999999999782},{"x":68483.820000000006985,"y":110559.19999999999709},{"x":67197.940000000002328,"y":96702.210000000006403},{"x":25469.869999999998981,"y":79049.490000000005239},{"x":157059.89999999999418,"y":34963.410000000003492}]},"level":6,"type":"dwarf","data1":"njpz38JSfz00p2Lc2Jv0dON7UfTljRgz0J2B7w7K","data2":"4hpsT2szDrssbUCHEirTzHOIhSxTd83i1FO5aNXgoGAfO2srH1"}}
> ]}
>
> real 1m51.989s
> user 0m0.006s
> sys 0m0.004s
>
>
> 1.2.x + patch:
>
> fdmanana 07:29:11 ~/git/hub/slow_couchdb (master)> docs=200000 batch=5000 ./bench.sh wow.tpl
> Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
> {"couchdb":"Welcome","version":"1.2.0"}
>
> [INFO] Created DB named `db1'
> [INFO] Uploaded 5000 documents via _bulk_docs
> (....)
> [INFO] Uploaded 5000 documents via _bulk_docs
> Building view.
> {"total_rows":200000,"offset":0,"rows":[
> {"id":"0005cd07-49f4-4a99-b506-acef948f2acc","key":["dwarf","assassin"],"value":{"_id":"0005cd07-49f4-4a99-b506-acef948f2acc","_rev":"1-4b418e69618bf11124a03e1a3845f071","data3":"T0W2JBUET9yzRXHfUqcUBwFhYGKh14YFVxk","ratio":1.6999999999999999556,"integers":[25658,7573,47779,43217,49586,57992,13549,90984,45253,49560,1643,64085,38381,62544],"category":"assassin","nested":{"dict":{"oWhW4jJ6":199,"EPSVtKtS":5638,"8WpzvD5x":73714,"stD9Ynfh":8924,"0qh5Nc1g":5994,"pBa5vJyy":18,"s25oAkRc":165270},"string1":"fNNHb8lxtcy7GpwSU3yRyaC","string2":"rilbiZM7yAaK","values":[49632,93665,73258,75675,4229,15742,16965,76825,22049,79829],"string3":"IwX09SiOLMSSyxffMB","coords":[{"x":179620.45000000001164,"y":11539.989999999999782},{"x":68483.820000000006985,"y":110559.19999999999709},{"x":67197.940000000002328,"y":96702.210000000006403},{"x":25469.869999999998981,"y":79049.490000000005239},{"x":157059.89999999999418,"y":34963.410000000003492}]},"level":6,"type":"dwarf","data1":"njpz38JSfz00p2Lc2Jv0dON7UfTljRgz0J2B7w7K","data2":"4hpsT2szDrssbUCHEirTzHOIhSxTd83i1FO5aNXgoGAfO2srH1"}}
> ]}
>
> real 1m45.573s
> user 0m0.006s
> sys 0m0.004s
>
>
> Unless someone comes up with scenarios where 1.2.x with the patch is
> significantly slower than 1.1.x, I think we should close this and move
> to release 1.2.0.
>
> Thanks all for the testing.
>
>>
>> On Mar 3, 2012, at 5:25 PM, Bob Dionne wrote:
>>
>>> I ran some tests, using Bob's latest script. The first versus the second clearly show the regression. The third is the 1.2.x with the patch
>>> to couch_os_process reverted and it seems to have no impact. The last has Filipe's latest patch to couch_view_updater discussed in the
>>> other thread and it brings the performance back to the 1.1.x level.
>>>
>>> I'd say that patch is a +1
>>>
>>> 1.2.x
>>> real 3m3.093s
>>> user 0m0.028s
>>> sys 0m0.008s
>>> ==================
>>> 1.1.x
>>> real 2m16.609s
>>> user 0m0.026s
>>> sys 0m0.007s
>>> =================
>>> 1.2.x with patch to couch_os_process reverted
>>> real 3m7.012s
>>> user 0m0.029s
>>> sys 0m0.008s
>>> =================
>>> 1.2.x with Filipe's latest patch to couch_view_updater
>>> real 2m11.038s
>>> user 0m0.028s
>>> sys 0m0.007s
>>> On Feb 28, 2012, at 8:17 AM, Jason Smith wrote:
>>>
>>>> Forgive the clean new thread. Hopefully it will not remain so.
>>>>
>>>> If you can, would you please clone https://github.com/jhs/slow_couchdb
>>>>
>>>> And build whatever Erlangs and CouchDB checkouts you see fit, and run
>>>> the test. For example:
>>>>
>>>> docs=500000 ./bench.sh small_doc.tpl
>>>>
>>>> That should run the test and, God willing, upload the results to a
>>>> couch in the cloud. We should be able to use that information to
>>>> identify who you are, whether you are on SSD, what Erlang and Couch
>>>> build, and how fast it ran. Modulo bugs.
>>>
>>
>
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
> Unreasonable men adapt the world to themselves.
> That's why all progress depends on unreasonable men."
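
For comparing the `real` times quoted in this thread, a small Python sketch follows. It is not part of the original messages; `parse_real` and `percent_change` are names made up for illustration. It converts `time`-style output such as `2m49.227s` to seconds and reports each run's change relative to the 1.1.x run from the same machine.

```python
import re


def parse_real(value: str) -> float:
    """Convert a `time`-style duration such as '2m49.227s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def percent_change(baseline: str, other: str) -> float:
    """Signed change of `other` relative to `baseline`, in percent."""
    base = parse_real(baseline)
    return (parse_real(other) - base) / base * 100.0


# 'real' times copied from the thread; the labels are paraphrased.
filipe = {
    "1.1.x": "2m49.227s",
    "1.2.x": "1m51.989s",
    "1.2.x + patch": "1m45.573s",
}
bob = {
    "1.1.x": "2m16.609s",
    "1.2.x": "3m3.093s",
    "1.2.x, couch_os_process patch reverted": "3m7.012s",
    "1.2.x + couch_view_updater patch": "2m11.038s",
}

for who, runs in (("Filipe (wow.tpl)", filipe), ("Bob", bob)):
    for label, real in runs.items():
        change = percent_change(runs["1.1.x"], real)
        print(f"{who:18} {label:40} {real:>10} {change:+6.1f}%")
```

With these inputs, Filipe's wow.tpl run is roughly a third faster on 1.2.x than on 1.1.x, while Bob's run is roughly a third slower on unpatched 1.2.x and within a few percent of 1.1.x once the couch_view_updater patch is applied. The two sets of numbers come from different machines and test documents, so only the within-run comparisons are meaningful.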