couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Damien Katz <>
Subject Re: Detailed info on the B-tree store? Native implementations thereof?
Date Tue, 11 Aug 2009 18:19:36 GMT

On Aug 11, 2009, at 1:37 PM, Chris Anderson wrote:

> On Tue, Aug 11, 2009 at 8:43 AM, Jens Alfke<> wrote:
>> I'm interested in the underpinnings of the CouchDB server — the  
>> crash-proof
>> concurrent B-tree store. There's a blog post linked to in the wiki  
>> that
>> describes the basic concepts (leaves and updated intermediate nodes  
>> are
>> appended to the file; the start of the file stores two links to the  
>> root
>> node) but is there any more detailed description[1]? And is there any
>> similar technology available that's implemented in native code (C/C+ 
>> +)?[2]

I don't know of anything offhand that works quite like CouchDB. I  
think BerkleyDB and PostgreSQL shares some similar traits, but I don't  
know the details how their internals work.

> Jens,
> Glad to have you interested. There are a few posts about the B-tree
> store. This is probably the best, but slightly out of date:

Here are some more:

> Since this article, we've changed the header handling, so that we
> don't keep it at the top of the file, but instead append the header at
> the end of the file at every commit. The strict append-only nature of
> the storage engine is the source of it's robustness. Even an extreme
> action, like truncating the file, will not result in an inconsistent
> state.
> The other aspect our API that web storage will need to be
> concurrency-friendly is MVCC. Without MVCC you end up needing long
> transactions between page-loads, like localStorage currently has,
> which makes it useless for sharing state between windows.

I'm confused by this. MVCC isn't necessary to build a CouchDB  
compatible store, simple global lock per db operation can work fine in  
a browser context.

> As I
> analyzed in that blog post, once you have CouchDB-style MVCC tokens,
> you pretty much need to start dealing in documents to manage the {id,
> rev} tuple.

Oh, I think you mean the revision system. That works well with the  
internal MVCC storage engine, but MVCC is not a requirement.

> Maybe the easiest thing would be to just start bundling CouchDB with
> your browser. :)

Agreed :)

> I'll be living in Berkeley starting next month, so if you'd like to
> get together perhaps I can help get you oriented in the source code so
> you can see this stuff in action, yourself. Erlang is surprisingly
> simple once you get started.
> Chris
>> Basically I'm interested in whether it's feasible to build a simple  
>> storage
>> system (for use in an HTML5 Web browser) that a CouchDB-compatible  
>> client
>> library could be built on top of. JChris has posted about this topic
>> recently[3], and pointed out that the hashtable-oriented key-value  
>> store
>> currently speced in HTML5 is a poor match for CouchDB. Moreover,  
>> the SQLite
>> database engine underneath it doesn't guarantee data integrity  
>> after a hard
>> system crash (as I know from painful experience.) So: could we  
>> build a
>> fault-tolerant B-tree based API into the browser? (This isn't just  
>> academic
>> curiosity: I recently started work on the Chrome team at Google,  
>> and HTML5
>> local storage is one of my group's responsibilities.)

Of course its possible, it's just software :) But there is a lot to  
the storage engine in CouchDB, 2 different btrees (by_name and by_seq)  
and document and attachment storage with online compaction and  
revision tracking, conflict management etc, and thats not to mention  
the workings of the view engine. Getting something works like like  
CouchDB is going to be big task, but I wholeheartedly encourage it.

In the meantime, just embed CouchDB as Chrome subprocess.

>> Thanks!
>> —Jens
>> [1] Alas, I cannot Use The Source, Luke, as I do not have Erlang  
>> skillz. :(
>> [2] I know of many, many B-tree libraries (Berkeley DB,  
>> TokyoCabinet...) but
>> none that are fault-tolerant.
>> [3]
> -- 
> Chris Anderson

View raw message