couchdb-dev mailing list archives

From Chris Anderson <>
Subject Re: Detailed info on the B-tree store? Native implementations thereof?
Date Tue, 11 Aug 2009 22:03:20 GMT
On Tue, Aug 11, 2009 at 12:26 PM, Damien Katz<> wrote:
> On Aug 11, 2009, at 3:07 PM, Jens Alfke wrote:
>> On Aug 11, 2009, at 10:37 AM, Chris Anderson wrote:
>>> Since this article, we've changed the header handling, so that we
>>> don't keep it at the top of the file, but instead append the header at
>>> the end of the file at every commit. The strict append-only nature of
>>> the storage engine is the source of its robustness. Even an extreme
>>> action, like truncating the file, will not result in an inconsistent
>>> state.
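
To make the quoted point concrete, here is a minimal sketch of an append-only file whose header is appended after each commit, with recovery scanning backward for the newest header that verifies. The record layout, the HDR1 marker, the CRC32 checksum, and both function names are assumptions made for illustration; this is not CouchDB's actual on-disk format.

```python
import struct
import zlib

MAGIC = b"HDR1"  # hypothetical marker, not CouchDB's real format
HEADER_SIZE = len(MAGIC) + 12 + 4  # marker + (start, length) + CRC32


def append_commit(buf: bytearray, payload: bytes) -> None:
    """Append the payload, then a header that points back at it."""
    start = len(buf)
    buf += payload
    body = struct.pack(">QI", start, len(payload))
    buf += MAGIC + body + struct.pack(">I", zlib.crc32(payload + body))


def last_valid_commit(buf: bytes):
    """Scan backward for the newest header whose checksum verifies."""
    pos = len(buf) - HEADER_SIZE
    while pos >= 0:
        if buf[pos:pos + 4] == MAGIC:
            body = buf[pos + 4:pos + 16]
            start, length = struct.unpack(">QI", body)
            (crc,) = struct.unpack(">I", buf[pos + 16:pos + 20])
            payload = buf[start:start + length]
            # The payload must sit directly before its header and match the CRC.
            if start + length == pos and zlib.crc32(payload + body) == crc:
                return bytes(payload)
        pos -= 1
    return None  # no complete commit survives


log = bytearray()
append_commit(log, b"v1")
append_commit(log, b"v2")
torn = bytes(log[:-3])  # simulate a truncated final write
recovered = last_valid_commit(torn)  # falls back to the previous commit
```

In this sketch a truncated or damaged last commit is simply skipped, because older data is never overwritten. The checksum is what would catch garbage inside the last transaction; whether the real file format does this is not stated in this thread.
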
>> Interesting. Does this really guarantee file integrity even in the case of
>> power failure? (I have some experience dealing with file corruption, from
>> working on Mac OS X components that use sqlite.) The worst problem is that
>> the disk controller will reorder sector writes to reduce seek time, which in
>> effect means that if power is lost, some random subset of the last writes
>> may not happen. So you won't just end up with a truncated file — you could
>> have a file that seems intact and has a correct header at the end, but has
>> 4k bytes of garbage somewhere within the last transaction. Does CouchDB's
>> file structure guard against that?
>> My concern with HTML5 local storage is that it's going to be used for
>> important user data that cannot be lost, just the way native apps put
>> irreplaceable data in local files. But the data stores being used to
>> implement local storage are much less resilient than the filesystem itself.
>> My experience with sqlite is that heavily-used databases on consumer
>> machines get corrupted and lost every few months. (This isn't directly
>> related to CouchDB itself; but it's why I'm interested in the fault-tolerant
>> data store it uses.)
>>> The other aspect of our API that web storage will need in order to be
>>> concurrency-friendly is MVCC. Without MVCC you end up needing long
>>> transactions between page-loads, like localStorage currently has,
>>> which makes it useless for sharing state between windows.
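
As a rough sketch of what that looks like from a script's point of view: each write names the revision it was based on, and a stale write is rejected instead of being serialized behind a long-held lock. This shows only the optimistic revision check that clients see, not a full MVCC engine, and MVCCStore, Conflict, and the integer revisions are hypothetical names chosen for illustration.

```python
class Conflict(Exception):
    """Raised when a write is based on a stale revision."""


class MVCCStore:
    def __init__(self):
        self._docs = {}  # key -> (revision, value)

    def get(self, key):
        # Readers never block; a missing key reads as revision 0.
        return self._docs.get(key, (0, None))

    def put(self, key, value, expected_rev):
        current_rev, _ = self.get(key)
        if expected_rev != current_rev:
            raise Conflict(f"{key}: expected rev {expected_rev}, found {current_rev}")
        self._docs[key] = (current_rev + 1, value)
        return current_rev + 1


store = MVCCStore()
rev_a, _ = store.get("cart")  # window A reads
rev_b, _ = store.get("cart")  # window B reads the same revision
store.put("cart", ["book"], rev_a)  # A's write succeeds
try:
    store.put("cart", ["pen"], rev_b)  # B's write is stale
    outcome = "written"
except Conflict:
    outcome = "conflict"  # B re-reads and retries; no lock was held
```

Neither window holds a lock between reading and writing, so a slow or stuck script cannot block the other; it only risks having to retry.
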
>> I'm still not 100% convinced by your analysis in that blog post. A script
>> running in a web page will implicitly acquire a lock when it accesses local
>> storage, and release the lock at the end of the current event that it's
>> handling (i.e. a user action or XHR response.) This is sufficiently
>> fine-grained as to not pose a problem, I think.
>> But Jeremy Orlow pointed out a more problematic case to me: the HTML5
>> worker-thread API. Worker threads should be able to access local storage,
>> and they don't have an event-based model; so a worker thread will probably
>> be within some internal 'while' loop during its entire lifespan. There is
>> thus no way to automatically handle transactions for it, so it will have to
>> manually acquire and release locks. That means that a buggy or blocked
>> worker thread could starve web pages in the same domain from accessing local
>> storage. That's bad.
>>> Maybe the easiest thing would be to just start bundling CouchDB with
>>> your browser. :)
>> In a lot of ways that would be really awesome. However, it would have a
>> terrible effect on the download size of the browser, which is an important
>> consideration. (IIRC, the all-in-one double-clickable Mac CouchDB package is
>> something like 15MB.)
> Things can be made much much smaller. For example it brings in spidermonkey,
> all the Erlang libraries plus most of the ICU library for collation. If
> we reused the browser's UTF and JavaScript support, I think we could get
> CouchDB + dependencies under a meg compressed.
> -Damien

Well, that's changed my mind. I think the pragmatic thing to do would
be to run CouchDB as a browser subprocess and see what happens.

>> I like the idea, which I think you proposed, of putting a basic b-tree API
>> into the browser, and being able to implement a lite storage system
>> compatible with CouchDB on top of it in JS.
>> —Jens

Chris Anderson