Return-Path: Delivered-To: apmail-couchdb-dev-archive@www.apache.org Received: (qmail 19600 invoked from network); 11 Aug 2009 18:20:06 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 11 Aug 2009 18:20:06 -0000 Received: (qmail 40775 invoked by uid 500); 11 Aug 2009 18:20:12 -0000 Delivered-To: apmail-couchdb-dev-archive@couchdb.apache.org Received: (qmail 40712 invoked by uid 500); 11 Aug 2009 18:20:12 -0000 Mailing-List: contact dev-help@couchdb.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@couchdb.apache.org Delivered-To: mailing list dev@couchdb.apache.org Received: (qmail 40701 invoked by uid 99); 11 Aug 2009 18:20:12 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 11 Aug 2009 18:20:12 +0000 X-ASF-Spam-Status: No, hits=0.2 required=10.0 tests=RCVD_IN_DNSWL_LOW,SPF_NEUTRAL X-Spam-Check-By: apache.org Received-SPF: neutral (athena.apache.org: local policy) Received: from [209.68.5.17] (HELO relay03.pair.com) (209.68.5.17) by apache.org (qpsmtpd/0.29) with SMTP; Tue, 11 Aug 2009 18:20:01 +0000 Received: (qmail 44253 invoked from network); 11 Aug 2009 18:19:37 -0000 Received: from 75.143.234.216 (HELO ?192.168.1.197?) (75.143.234.216) by relay03.pair.com with SMTP; 11 Aug 2009 18:19:37 -0000 X-pair-Authenticated: 75.143.234.216 Message-Id: <3483AAEF-F25C-4BC4-8427-AB49E0856A5A@apache.org> From: Damien Katz To: dev@couchdb.apache.org In-Reply-To: Content-Type: text/plain; charset=WINDOWS-1252; format=flowed; delsp=yes Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Apple Message framework v936) Subject: Re: Detailed info on the B-tree store? Native implementations thereof? Date: Tue, 11 Aug 2009 14:19:36 -0400 References: <5B24687D-DB8B-4FB4-B738-D3A19C426846@mooseyard.com> X-Mailer: Apple Mail (2.936) X-Virus-Checked: Checked by ClamAV on apache.org On Aug 11, 2009, at 1:37 PM, Chris Anderson wrote: > On Tue, Aug 11, 2009 at 8:43 AM, Jens Alfke wrote: >> I'm interested in the underpinnings of the CouchDB server =97 the =20 >> crash-proof >> concurrent B-tree store. There's a blog post linked to in the wiki =20= >> that >> describes the basic concepts (leaves and updated intermediate nodes =20= >> are >> appended to the file; the start of the file stores two links to the =20= >> root >> node) but is there any more detailed description[1]? And is there any >> similar technology available that's implemented in native code (C/C+=20= >> +)?[2] I don't know of anything offhand that works quite like CouchDB. I =20 think BerkleyDB and PostgreSQL shares some similar traits, but I don't =20= know the details how their internals work. >> > > Jens, > > Glad to have you interested. There are a few posts about the B-tree > store. This is probably the best, but slightly out of date: > > http://horicky.blogspot.com/2008/10/couchdb-implementation.html Here are some more: = http://ayende.com/Blog/archive/2008/09/24/more-couchdb-reading-btreelookup= .aspx = http://ayende.com/Blog/archive/2008/09/24/more-couchdb-reading-btreequery_= modify.aspx = http://ayende.com/Blog/archive/2008/10/04/erlang-reading-couchdb-digging-d= own-to-disk.aspx > > Since this article, we've changed the header handling, so that we > don't keep it at the top of the file, but instead append the header at > the end of the file at every commit. The strict append-only nature of > the storage engine is the source of it's robustness. Even an extreme > action, like truncating the file, will not result in an inconsistent > state. > > The other aspect our API that web storage will need to be > concurrency-friendly is MVCC. Without MVCC you end up needing long > transactions between page-loads, like localStorage currently has, > which makes it useless for sharing state between windows. I'm confused by this. MVCC isn't necessary to build a CouchDB =20 compatible store, simple global lock per db operation can work fine in =20= a browser context. > As I > analyzed in that blog post, once you have CouchDB-style MVCC tokens, > you pretty much need to start dealing in documents to manage the {id, > rev} tuple. Oh, I think you mean the revision system. That works well with the =20 internal MVCC storage engine, but MVCC is not a requirement. > > Maybe the easiest thing would be to just start bundling CouchDB with > your browser. :) Agreed :) > > I'll be living in Berkeley starting next month, so if you'd like to > get together perhaps I can help get you oriented in the source code so > you can see this stuff in action, yourself. Erlang is surprisingly > simple once you get started. > > Chris > > >> Basically I'm interested in whether it's feasible to build a simple =20= >> storage >> system (for use in an HTML5 Web browser) that a CouchDB-compatible =20= >> client >> library could be built on top of. JChris has posted about this topic >> recently[3], and pointed out that the hashtable-oriented key-value =20= >> store >> currently speced in HTML5 is a poor match for CouchDB. Moreover, =20 >> the SQLite >> database engine underneath it doesn't guarantee data integrity =20 >> after a hard >> system crash (as I know from painful experience.) So: could we =20 >> build a >> fault-tolerant B-tree based API into the browser? (This isn't just =20= >> academic >> curiosity: I recently started work on the Chrome team at Google, =20 >> and HTML5 >> local storage is one of my group's responsibilities.) Of course its possible, it's just software :) But there is a lot to =20 the storage engine in CouchDB, 2 different btrees (by_name and by_seq) =20= and document and attachment storage with online compaction and =20 revision tracking, conflict management etc, and thats not to mention =20 the workings of the view engine. Getting something works like like =20 CouchDB is going to be big task, but I wholeheartedly encourage it. In the meantime, just embed CouchDB as Chrome subprocess. >> >> Thanks! >> >> =97Jens >> >> [1] Alas, I cannot Use The Source, Luke, as I do not have Erlang =20 >> skillz. :( >> [2] I know of many, many B-tree libraries (Berkeley DB, =20 >> TokyoCabinet...) but >> none that are fault-tolerant. >> [3] = http://jchrisa.net/drl/_design/sofa/_show/post/Fixing-HTML-5-Storage > > > > --=20 > Chris Anderson > http://jchrisa.net > http://couch.io