From: Chris Anderson
To: dev@couchdb.apache.org
Date: Mon, 11 May 2009 13:52:15 -0700
Subject: Re: Patch to couch_btree:chunkify

On Mon, May 11, 2009 at 12:57 PM, Adam Kocoloski wrote:
> I'd like to see some more concrete numbers on the performance difference
> between the two versions.  I wasn't able to reproduce Chris' 10%+ speedup
> using hovercraft:lightning; in fact, the two versions seem to be compatible
> within measurement variance.
>

All I did was run hovercraft:lightning() with and without the patch.
Without the patch I was seeing 6.9k docs/sec, with the patch I think it
was 7.7k. The fastest I got was with the patch *and* with compression
turned off in couch_btree:append_term, when I saw 10.1k docs/sec.

I'm running on an old white MacBook, and I seem to be IO bound (with
compression turned off). If I turn compression back on, I become CPU
bound and end up writing to the disk at a slower rate. These effects
would be wildly different on other hardware, I'm guessing.

> I tried messing around with fprof for a while today, and if anything it
> indicates that the original version might actually be faster (though I find
> that hard to believe).
> Anyway, I think we should get in the habit of having
> some quantitative, reproducible way of evaluating performance-related
> patches.
>
> +1 for Bob's suggestion of stripping out Bt from the arguments, though.
>
> Adam
>
> On May 11, 2009, at 3:28 PM, Damien Katz wrote:
>
>> +1 for committing.
>>
>> -Damien
>>
>>
>> On May 10, 2009, at 9:49 PM, Paul Davis wrote:
>>
>>> Chris reminded me that I had an optimization patch laying around for
>>> couch_btree:chunkify and his tests show that it gets a bit of a speed
>>> increase when running some tests with hovercraft. The basic outline of
>>> what I did was to swap a call like term_to_binary([ListOfTuples]) to a
>>> sequence of ListOfSizes = lists:map(term_to_binary, ListOfTuples),
>>> Size = sum(ListOfSizes), and then when we go through the list of
>>> tuples to split them into chunks I use the precalculated sizes.
>>>
>>> Anyway, I just wanted to run it across the list before I commit it in
>>> case anyone sees anything subtle I might be missing.
>>>
>>> chunkify(_Bt, []) ->
>>>     [];
>>> chunkify(Bt, InList) ->
>>>     ToSize = fun(X) -> size(term_to_binary(X)) end,
>>>     SizeList = lists:map(ToSize, InList),
>>>     TotalSize = lists:sum(SizeList),
>>>     case TotalSize of
>>>     Size when Size > ?CHUNK_THRESHOLD ->
>>>         NumberOfChunksLikely = ((Size div ?CHUNK_THRESHOLD) + 1),
>>>         ChunkThreshold = Size div NumberOfChunksLikely,
>>>         chunkify(Bt, InList, SizeList, ChunkThreshold, [], 0, []);
>>>     _Else ->
>>>         [InList]
>>>     end.
>>>
>>> chunkify(_Bt, [], [], _Threshold, [], 0, Chunks) ->
>>>     lists:reverse(Chunks);
>>> chunkify(_Bt, [], [], _Threshold, OutAcc, _OutAccSize, Chunks) ->
>>>     lists:reverse([lists:reverse(OutAcc) | Chunks]);
>>> chunkify(Bt, [InElement | RestInList], [InSize | RestSizes], Threshold,
>>>         OutAcc, OutAccSize, Chunks) ->
>>>     case InSize of
>>>     InSize when (InSize + OutAccSize) > Threshold andalso OutAcc /= [] ->
>>>         chunkify(Bt, RestInList, RestSizes, Threshold, [], 0,
>>>             [lists:reverse([InElement | OutAcc]) | Chunks]);
>>>     InSize ->
>>>         chunkify(Bt, RestInList, RestSizes, Threshold, [InElement | OutAcc],
>>>             OutAccSize + InSize, Chunks)
>>>     end.
>>
>

-- 
Chris Anderson
http://jchrisa.net
http://couch.io
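
Two points from the thread can be sketched together: dropping the unused Bt
argument from chunkify, and having a small, repeatable way to time a change.
The module below is only an illustration under stated assumptions, not the
code that was committed. The module name chunkify_sketch, the helper names
split/6 and bench/2, the extra chunkify/2 arity (so the threshold can be
varied), and the CHUNK_THRESHOLD value are all made up for this example; the
thread does not say what couch_btree defines the threshold as.

```erlang
-module(chunkify_sketch).
-export([chunkify/1, chunkify/2, bench/2]).

%% Assumed value for illustration only; the thread does not state
%% how ?CHUNK_THRESHOLD is defined in couch_btree.
-define(CHUNK_THRESHOLD, 1279).

chunkify(InList) ->
    chunkify(InList, ?CHUNK_THRESHOLD).

%% Same logic as the quoted patch, minus the Bt argument: serialize each
%% element once, remember its size, and reuse the sizes while splitting.
chunkify([], _MaxSize) ->
    [];
chunkify(InList, MaxSize) ->
    SizeList = [byte_size(term_to_binary(X)) || X <- InList],
    case lists:sum(SizeList) of
        Total when Total > MaxSize ->
            NumberOfChunksLikely = (Total div MaxSize) + 1,
            Threshold = Total div NumberOfChunksLikely,
            split(InList, SizeList, Threshold, [], 0, []);
        _ ->
            [InList]
    end.

%% Walk elements and sizes in lockstep. A chunk is closed when adding the
%% current element would exceed the threshold and the chunk is non-empty.
split([], [], _Threshold, [], 0, Chunks) ->
    lists:reverse(Chunks);
split([], [], _Threshold, OutAcc, _OutAccSize, Chunks) ->
    lists:reverse([lists:reverse(OutAcc) | Chunks]);
split([Elem | RestIn], [Size | RestSizes], Threshold, OutAcc, OutAccSize, Chunks)
  when Size + OutAccSize > Threshold, OutAcc =/= [] ->
    split(RestIn, RestSizes, Threshold, [], 0,
          [lists:reverse([Elem | OutAcc]) | Chunks]);
split([Elem | RestIn], [Size | RestSizes], Threshold, OutAcc, OutAccSize, Chunks) ->
    split(RestIn, RestSizes, Threshold, [Elem | OutAcc], OutAccSize + Size, Chunks).

%% Hypothetical timing helper: calls Fun() N times and returns the mean
%% wall-clock time per call in microseconds, measured with timer:tc/1.
bench(Fun, N) when N > 0 ->
    {Micros, _} = timer:tc(fun() -> repeat(Fun, N) end),
    Micros / N.

repeat(_Fun, 0) -> ok;
repeat(Fun, N) -> Fun(), repeat(Fun, N - 1).
```

Something like chunkify_sketch:bench(fun() -> chunkify_sketch:chunkify(L) end, 1000)
gives one number per variant for a fixed list L. Whatever the threshold,
lists:append/1 applied to the result should return the original list, which
makes a cheap correctness check to run next to the timing. As noted above,
absolute numbers will depend heavily on the hardware.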