From: Damien Katz
To: user@couchdb.apache.org
Subject: Re: Why sequential document ids? [was: Re: What's the speed(performance) of couchdb?]
Date: Thu, 26 Feb 2009 14:18:06 -0500

On Feb 26, 2009, at 1:55 PM, Jan Lehnardt wrote:

> On 26 Feb 2009, at 19:49, Barry Wark wrote:
>>> or ascending...
>>
>> As an aside, why is it that sequential document ids would produce a
>> significant performance boost? I suspect the answer is something
>> rather fundamental to CouchDB's design, and I'd like to try to grok
>> it.
>
> b-tree inner nodes can get cached better if inserts basically always
> use the same path.

What he said. It's pretty standard btree stuff; most, if not all, of the major RDBMSs have similar issues with primary keys.

Also, the ids don't need to be sequential (1,2,3,4...), just ordered (1,5,19,22...). And they don't need to sort higher or lower than all the other ids, so long as they are clustered together.

Each btree node that has to be loaded and isn't in cache is expensive. The more the keys have to be inserted into random places in the btree, the worse the caching behavior. Right now, with the crypto random UUIDs we generate, it's basically the worst-case scenario for doc inserts.

-Damien
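
To make the caching argument concrete, here is a rough sketch in Python. It is a toy model, not CouchDB's btree: it assumes a key lives on page `key // keys_per_page` and that a small LRU cache holds a few pages. The function name, the page size, and the cache size are all made up for illustration. It only counts how often an insert touches a page that isn't cached.

```python
import random
from collections import OrderedDict


def count_cache_misses(keys, keys_per_page=100, cache_pages=8):
    """Count page loads for a sequence of inserts in a toy page/LRU model.

    Assumption (not CouchDB's real layout): a key is stored on page
    key // keys_per_page, and at most cache_pages pages stay in memory.
    """
    cache = OrderedDict()  # page id -> True, ordered from least to most recently used
    misses = 0
    for key in keys:
        page = key // keys_per_page
        if page in cache:
            cache.move_to_end(page)        # cache hit: mark page as recently used
        else:
            misses += 1                    # cache miss: page must be loaded
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return misses


rng = random.Random(0)
n = 10_000

sequential = list(range(n))                     # 0, 1, 2, 3, ...
ordered = sorted(rng.sample(range(n * 10), n))  # ascending, with gaps
scattered = sequential[:]
rng.shuffle(scattered)                          # same keys, random insert order

sequential_misses = count_cache_misses(sequential)
ordered_misses = count_cache_misses(ordered)
scattered_misses = count_cache_misses(scattered)

if __name__ == "__main__":
    print("sequential:", sequential_misses)
    print("ordered:   ", ordered_misses)
    print("scattered: ", scattered_misses)
```

In this model the sequential and merely ordered inserts load each page once and then keep hitting it, while the scattered inserts miss on most writes because the target page has usually been evicted already. Randomly generated ids fall into the scattered case.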