couchdb-dev mailing list archives

From Damien Katz <dam...@apache.org>
Subject Re: Detailed info on the B-tree store? Native implementations thereof?
Date Wed, 12 Aug 2009 00:03:53 GMT

On Aug 11, 2009, at 3:07 PM, Jens Alfke wrote:

>
> On Aug 11, 2009, at 10:37 AM, Chris Anderson wrote:
>
>> Since this article, we've changed the header handling, so that we
>> don't keep it at the top of the file, but instead append the header at
>> the end of the file at every commit. The strict append-only nature of
>> the storage engine is the source of its robustness. Even an extreme
>> action, like truncating the file, will not result in an inconsistent
>> state.
>
> Interesting. Does this really guarantee file integrity even in the  
> case of power failure? (I have some experience dealing with file  
> corruption, from working on Mac OS X components that use sqlite.)

CouchDB is completely protected from improper shutdowns (so long as  
the filesystem doesn't corrupt already fsync'd data), even if the file  
gets truncated, a very common occurrence when you run out of disk space.

> The worst problem is that the disk controller will reorder sector  
> writes to reduce seek time, which in effect means that if power is  
> lost, some random subset of the last writes may not happen. So you  
> won't just end up with a truncated file — you could have a file that  
> seems intact and has a correct header at the end, but has 4k bytes  
> of garbage somewhere within the last transaction. Does CouchDB's  
> file structure guard against that?

First we fsync all the data and indexes, then we write and fsync the  
headers in a separate step.
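
A minimal sketch of that two-step ordering, in Python. The header layout (`MAGIC`, a length field, a CRC32) and the names `commit` and `last_valid_offset` are hypothetical and chosen for illustration; this is not CouchDB's actual on-disk format. It assumes, as above, that the filesystem does not corrupt data that has already been fsync'd.

```python
import os
import struct
import zlib

MAGIC = b"HDR1"                    # hypothetical marker, not CouchDB's format
HEADER = struct.Struct(">4sQI")    # marker, offset of the header, CRC32 of both


def commit(path, data):
    """Append data and fsync it, then append and fsync a header separately."""
    with open(path, "ab") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())       # step 1: data is durable before any header
        end = os.fstat(f.fileno()).st_size
        body = struct.pack(">4sQ", MAGIC, end)
        f.write(body + struct.pack(">I", zlib.crc32(body)))
        f.flush()
        os.fsync(f.fileno())       # step 2: header is durable


def last_valid_offset(path):
    """Scan backwards for the newest intact header; return the offset just past it."""
    with open(path, "rb") as f:
        buf = f.read()
    pos = buf.rfind(MAGIC)
    while pos != -1:
        chunk = buf[pos:pos + HEADER.size]
        if len(chunk) == HEADER.size:
            _, offset, crc = HEADER.unpack(chunk)
            # A header counts only if its checksum matches and it sits exactly
            # at the offset it records.
            if crc == zlib.crc32(chunk[:12]) and offset == pos:
                return pos + HEADER.size
        pos = buf.rfind(MAGIC, 0, pos + len(MAGIC) - 1)
    return 0                       # no committed state found
```

On open, a reader of such a file would ignore everything past `last_valid_offset(path)`. A truncated file or a torn trailing write then falls back to the previous commit instead of exposing a partial one.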

-Damien

>
> My concern with HTML5 local storage is that it's going to be used  
> for important user data that cannot be lost, just the way native  
> apps put irreplaceable data in local files. But the data stores being  
> used to implement local storage are much less resilient than the  
> filesystem itself. My experience with sqlite is that heavily-used  
> databases on consumer machines get corrupted and lost every few  
> months. (This isn't directly related to CouchDB itself; but it's why  
> I'm interested in the fault-tolerant data store it uses.)
>
>> The other aspect of our API that web storage will need in order to be
>> concurrency-friendly is MVCC. Without MVCC you end up needing long
>> transactions between page-loads, like localStorage currently has,
>> which makes it useless for sharing state between windows.
>
> I'm still not 100% convinced by your analysis in that blog post. A  
> script running in a web page will implicitly acquire a lock when it  
> accesses local storage, and release the lock at the end of the  
> current event that it's handling (i.e. a user action or XHR  
> response.) This is sufficiently fine-grained as to not pose a  
> problem, I think.
>
> But Jeremy Orlow pointed out a more problematic case to me — the  
> HTML5 worker-thread API. Worker threads should be able to access  
> local storage, and they don't have an event-based model; so a worker  
> thread will probably be within some internal 'while' loop during its  
> entire lifespan. There is thus no way to automatically handle  
> transactions for it, so it will have to manually acquire and release  
> locks. That means that a buggy or blocked worker thread could starve  
> web pages in the same domain from accessing local storage. That's bad.
>
>> Maybe the easiest thing would be to just start bundling CouchDB with
>> your browser. :)
>
> In a lot of ways that would be really awesome. However, it would  
> have a terrible effect on the download size of the browser, which is  
> an important consideration. (IIRC, the all-in-one double-clickable  
> Mac CouchDB package is something like 15MB.)
>
> I like the idea, which I think you proposed, of putting a basic
> b-tree API into the browser, and being able to implement a lite
> storage system compatible with CouchDB on top of it in JS.
>
> —Jens

