couchdb-dev mailing list archives

From: Chris Anderson
Subject: Re: Detailed info on the B-tree store? Native implementations thereof?
Date: Tue, 11 Aug 2009 19:39:47 GMT
On Tue, Aug 11, 2009 at 12:07 PM, Jens Alfke wrote:
> On Aug 11, 2009, at 10:37 AM, Chris Anderson wrote:
>> Since this article, we've changed the header handling, so that we
>> don't keep it at the top of the file, but instead append the header at
>> the end of the file at every commit. The strict append-only nature of
>> the storage engine is the source of its robustness. Even an extreme
>> action, like truncating the file, will not result in an inconsistent
>> state.
> Interesting. Does this really guarantee file integrity even in the case of
> power failure? (I have some experience dealing with file corruption, from
> working on Mac OS X components that use sqlite.) The worst problem is that
> the disk controller will reorder sector writes to reduce seek time, which in
> effect means that if power is lost, some random subset of the last writes
> may not happen. So you won't just end up with a truncated file — you could
> have a file that seems intact and has a correct header at the end, but has
> 4k bytes of garbage somewhere within the last transaction. Does CouchDB's
> file structure guard against that?

We haven't done much in the way of platform-specific hacks (aside from
the real-force fsync on OS X). So we do have some assumptions (the
main one being that the last bytes of a commit will be written last).
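
To make the append-only idea and that assumption concrete, here is a
minimal sketch in Python. It is not CouchDB's on-disk format: the MAGIC
marker, the trailer layout, and both function names are hypothetical, and
the CRC32 over the commit body is an extra guard added for illustration,
not something this thread says CouchDB does. Each commit appends its data
and then a small trailer; recovery scans backward for the newest trailer
whose checksum matches the bytes in front of it.

```python
import os
import struct
import zlib

MAGIC = b"HDR1"  # hypothetical marker, not CouchDB's real header format


def append_commit(path, payload):
    """Append payload bytes, then a trailer describing them, then fsync."""
    with open(path, "ab") as f:
        f.write(payload)
        # Trailer: magic, payload length, CRC32 of the payload.
        f.write(MAGIC + struct.pack(">II", len(payload), zlib.crc32(payload)))
        f.flush()
        os.fsync(f.fileno())


def last_valid_commit(path):
    """Return the newest payload whose trailer is complete and checks out."""
    with open(path, "rb") as f:
        data = f.read()  # fine for a sketch; a real store would read backward
    end = len(data)
    while True:
        pos = data.rfind(MAGIC, 0, end)
        if pos < 0:
            return None  # no intact commit anywhere in the file
        trailer = data[pos + 4:pos + 12]
        if len(trailer) == 8:
            length, crc = struct.unpack(">II", trailer)
            start = pos - length
            if start >= 0 and zlib.crc32(data[start:pos]) == crc:
                return data[start:pos]
        # Truncated or mismatched trailer: keep looking further back.
        end = pos + len(MAGIC) - 1
```

With this layout, a truncated file falls back to the previous commit, and
a commit whose body was damaged behind an intact trailer is skipped
because its checksum no longer matches.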

The deterministic revids help with detecting bad commits later,
although that's not exactly the answer to your question. Details here:

To keep your data safe, ideally your local database is replicating to
the cloud all the time, anyway.

> My concern with HTML5 local storage is that it's going to be used for
> important user data that cannot be lost, just the way native apps put
> irreplaceable data in local files. But the data stores being used to
> implement local storage are much less resilient than the filesystem itself.
> My experience with sqlite is that heavily-used databases on consumer
> machines get corrupted and lost every few months. (This isn't directly
> related to CouchDB itself; but it's why I'm interested in the fault-tolerant
> data store it uses.)
>> The other aspect of our API that web storage will need in order to be
>> concurrency-friendly is MVCC. Without MVCC you end up needing long
>> transactions between page-loads, like localStorage currently has,
>> which makes it useless for sharing state between windows.
> I'm still not 100% convinced by your analysis in that blog post. A script
> running in a web page will implicitly acquire a lock when it accesses local
> storage, and release the lock at the end of the current event that it's
> handling (i.e. a user action or XHR response.) This is sufficiently
> fine-grained as to not pose a problem, I think.
> But Jeremy Orlow pointed out a more problematic case to me: the HTML5
> worker-thread API. Worker threads should be able to access local storage,
> and they don't have an event-based model; so a worker thread will probably
> be within some internal 'while' loop during its entire lifespan. There is
> thus no way to automatically handle transactions for it, so it will have to
> manually acquire and release locks. That means that a buggy or blocked
> worker thread could starve web pages in the same domain from accessing local
> storage. That's bad.

Yes, I suppose the single threaded page is not so much a challenge as
the worker threads. But we've got to assume that worker threads will
be getting more popular; someday they may even be running on remote
machines. When designing for concurrency, it's better to plan for more
concurrency, not less.

We could run a local couchdb API inside a worker thread, which would
have a "protected" localStorage instance. In this case the worker
thread could handle serializing access to it, and enforce MVCC, even
if the underlying storage engine doesn't. However, doing it this way
would probably serialize reads as well as writes. That'd be a shame as
the appeal of MVCC is that it is friendly to parallel readers, while
enforcing write consistency.
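
As a rough illustration of the property worth keeping, here is a minimal
Python sketch of an MVCC-style store in which only writers are
serialized. The class, the Conflict exception, and the integer revision
counter are all hypothetical (CouchDB's real revision ids look
different), and the lock-free read relies on the assumption that a single
dict lookup is atomic, as it is in CPython.

```python
import threading


class Conflict(Exception):
    """Raised when a write is based on a stale revision."""


class MVCCStore:
    def __init__(self):
        # key -> (rev, value); each tuple is replaced whole, never mutated.
        self._docs = {}
        self._write_lock = threading.Lock()

    def get(self, key):
        # Readers take no lock: they see whichever (rev, value) is current.
        return self._docs.get(key, (0, None))

    def put(self, key, value, base_rev):
        # Writers are serialized and must name the revision they started from.
        with self._write_lock:
            current_rev, _ = self._docs.get(key, (0, None))
            if base_rev != current_rev:
                raise Conflict(
                    f"{key}: expected rev {current_rev}, got {base_rev}"
                )
            new_rev = current_rev + 1
            self._docs[key] = (new_rev, value)
            return new_rev
```

A writer that read revision 1 and then lost a race gets a Conflict
instead of silently overwriting, while readers never wait on the write
lock.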

>> Maybe the easiest thing would be to just start bundling CouchDB with
>> your browser. :)
> In a lot of ways that would be really awesome. However, it would have a
> terrible effect on the download size of the browser, which is an important
> consideration. (IIRC, the all-in-one double-clickable Mac CouchDB package is
> something like 15MB.)

The Ubuntu idea to run it as a system service helps with these
problems, but I realize it isn't really scoped nicely for browsers.

> I like the idea, which I think you proposed, of putting a basic b-tree API
> into the browser, and being able to implement a lite storage system
> compatible with CouchDB on top of it in JS.
> —Jens

Yes, I agree this is probably the pragmatic middle ground.


Chris Anderson
