Date: Sun, 10 May 2009 21:49:23 -0400
Subject: Patch to couch_btree:chunkify
From: Paul Davis
To: dev@couchdb.apache.org

Chris reminded me that I had an optimization patch lying around for
couch_btree:chunkify, and his tests with hovercraft show that it gives a
bit of a speed increase.

The basic outline of what I did was to swap a call like

    size(term_to_binary(ListOfTuples))

for the sequence

    ListOfSizes = lists:map(fun(X) -> size(term_to_binary(X)) end, ListOfTuples),
    Size = lists:sum(ListOfSizes),

and then, when we go through the list of tuples to split them into
chunks, I reuse the precalculated sizes.

Anyway, I just wanted to run it across the list before I commit it, in
case anyone sees anything subtle I might be missing.

chunkify(_Bt, []) ->
    [];
chunkify(Bt, InList) ->
    % Serialize each element once and keep its size for the split below.
    ToSize = fun(X) -> size(term_to_binary(X)) end,
    SizeList = lists:map(ToSize, InList),
    TotalSize = lists:sum(SizeList),
    case TotalSize of
    Size when Size > ?CHUNK_THRESHOLD ->
        NumberOfChunksLikely = ((Size div ?CHUNK_THRESHOLD) + 1),
        ChunkThreshold = Size div NumberOfChunksLikely,
        chunkify(Bt, InList, SizeList, ChunkThreshold, [], 0, []);
    _Else ->
        [InList]
    end.

chunkify(_Bt, [], [], _Threshold, [], 0, Chunks) ->
    lists:reverse(Chunks);
chunkify(_Bt, [], [], _Threshold, OutAcc, _OutAccSize, Chunks) ->
    lists:reverse([lists:reverse(OutAcc) | Chunks]);
chunkify(Bt, [InElement | RestInList], [InSize | RestSizes], Threshold,
        OutAcc, OutAccSize, Chunks) ->
    case InSize of
    InSize when (InSize + OutAccSize) > Threshold andalso OutAcc /= [] ->
        chunkify(Bt, RestInList, RestSizes, Threshold, [], 0,
            [lists:reverse([InElement | OutAcc]) | Chunks]);
    InSize ->
        chunkify(Bt, RestInList, RestSizes, Threshold,
            [InElement | OutAcc], OutAccSize + InSize, Chunks)
    end.
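
For anyone who wants to poke at the splitting logic outside CouchDB, here is a minimal standalone sketch of the same idea. It is not the patch itself: the module name chunkify_sketch, the helper split/6, the demo/0 function, and the explicit MaxSize argument (standing in for ?CHUNK_THRESHOLD) are all assumptions made for illustration, and the unused Bt argument is dropped. One thing to keep in mind when reading it is that the sum of the per-element sizes need not equal size(term_to_binary(InList)), since every separately encoded element carries its own header byte; the sketch treats the sum as an estimate only.

```erlang
-module(chunkify_sketch).
-export([chunkify/2, demo/0]).

%% Hypothetical stand-in for couch_btree:chunkify/2. MaxSize plays the
%% role of the ?CHUNK_THRESHOLD macro (an assumption for this sketch).
chunkify([], _MaxSize) ->
    [];
chunkify(InList, MaxSize) ->
    %% Serialize each element once and remember its encoded size.
    SizeList = [byte_size(term_to_binary(X)) || X <- InList],
    case lists:sum(SizeList) of
        Total when Total > MaxSize ->
            %% Aim for roughly equal chunks rather than filling each
            %% chunk up to MaxSize and leaving a small remainder.
            NumberOfChunksLikely = (Total div MaxSize) + 1,
            Threshold = Total div NumberOfChunksLikely,
            split(InList, SizeList, Threshold, [], 0, []);
        _Else ->
            [InList]
    end.

%% Walk the elements and their precomputed sizes in lockstep.
split([], [], _Threshold, [], 0, Chunks) ->
    lists:reverse(Chunks);
split([], [], _Threshold, OutAcc, _OutAccSize, Chunks) ->
    lists:reverse([lists:reverse(OutAcc) | Chunks]);
split([Elem | Rest], [Size | RestSizes], Threshold, OutAcc, OutAccSize, Chunks)
        when Size + OutAccSize > Threshold, OutAcc =/= [] ->
    %% The element that crosses the threshold closes the current chunk.
    split(Rest, RestSizes, Threshold, [], 0,
          [lists:reverse([Elem | OutAcc]) | Chunks]);
split([Elem | Rest], [Size | RestSizes], Threshold, OutAcc, OutAccSize, Chunks) ->
    split(Rest, RestSizes, Threshold, [Elem | OutAcc], OutAccSize + Size, Chunks).

demo() ->
    KVs = [{N, <<"value">>} || N <- lists:seq(1, 10)],
    Chunks = chunkify(KVs, 64),
    %% Concatenating the chunks must give back the input, in order.
    KVs = lists:append(Chunks),
    Chunks.
```

Calling chunkify_sketch:demo() from the shell returns the list of chunks, and the match against lists:append(Chunks) crashes if any element was lost or reordered along the way.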