On 12 Aug 2009, at 01:08, Robert Newson wrote: >> The worst problem is that the disk controller will reorder sector >> writes to reduce seek time, which in effect means that if power is >> lost, some random subset of the last writes may not happen. So you >> won't just end up with a truncated file > > But what about that issue? It think it's tolerated because couchdb > searches backward for the last non-corrupt header, right? > > find_header(_Fd, -1) -> > no_valid_header; > find_header(Fd, Block) -> > case (catch load_header(Fd, Block)) of > {ok, Bin} -> > {ok, Bin}; > _Error -> > find_header(Fd, Block -1) > end. Jens' next sentence was: > So you won't just end up with a truncated file — you could have a > file that seems intact and has a correct header at the end, but has > 4k bytes of garbage somewhere within the last transaction. Does > CouchDB's file structure guard against that? As far as I understand the file format, we're not safe against that. Cheers Jan -- > On Tue, Aug 11, 2009 at 6:37 PM, Chris Anderson > wrote: >> On Tue, Aug 11, 2009 at 8:43 AM, Jens Alfke >> wrote: >>> I'm interested in the underpinnings of the CouchDB server — the >>> crash-proof >>> concurrent B-tree store. There's a blog post linked to in the wiki >>> that >>> describes the basic concepts (leaves and updated intermediate >>> nodes are >>> appended to the file; the start of the file stores two links to >>> the root >>> node) but is there any more detailed description[1]? And is there >>> any >>> similar technology available that's implemented in native code (C/C >>> ++)?[2] >>> >> >> Jens, >> >> Glad to have you interested. There are a few posts about the B-tree >> store. This is probably the best, but slightly out of date: >> >> http://horicky.blogspot.com/2008/10/couchdb-implementation.html >> >> Since this article, we've changed the header handling, so that we >> don't keep it at the top of the file, but instead append the header >> at >> the end of the file at every commit. The strict append-only nature of >> the storage engine is the source of it's robustness. Even an extreme >> action, like truncating the file, will not result in an inconsistent >> state. >> >> The other aspect our API that web storage will need to be >> concurrency-friendly is MVCC. Without MVCC you end up needing long >> transactions between page-loads, like localStorage currently has, >> which makes it useless for sharing state between windows. As I >> analyzed in that blog post, once you have CouchDB-style MVCC tokens, >> you pretty much need to start dealing in documents to manage the {id, >> rev} tuple. >> >> Maybe the easiest thing would be to just start bundling CouchDB with >> your browser. :) >> >> I'll be living in Berkeley starting next month, so if you'd like to >> get together perhaps I can help get you oriented in the source code >> so >> you can see this stuff in action, yourself. Erlang is surprisingly >> simple once you get started. >> >> Chris >> >> >>> Basically I'm interested in whether it's feasible to build a >>> simple storage >>> system (for use in an HTML5 Web browser) that a CouchDB-compatible >>> client >>> library could be built on top of. JChris has posted about this topic >>> recently[3], and pointed out that the hashtable-oriented key-value >>> store >>> currently speced in HTML5 is a poor match for CouchDB. Moreover, >>> the SQLite >>> database engine underneath it doesn't guarantee data integrity >>> after a hard >>> system crash (as I know from painful experience.) So: could we >>> build a >>> fault-tolerant B-tree based API into the browser? (This isn't just >>> academic >>> curiosity: I recently started work on the Chrome team at Google, >>> and HTML5 >>> local storage is one of my group's responsibilities.) >>> >>> Thanks! >>> >>> —Jens >>> >>> [1] Alas, I cannot Use The Source, Luke, as I do not have Erlang >>> skillz. :( >>> [2] I know of many, many B-tree libraries (Berkeley DB, >>> TokyoCabinet...) but >>> none that are fault-tolerant. >>> [3] http://jchrisa.net/drl/_design/sofa/_show/post/Fixing-HTML-5-Storage >> >> >> >> -- >> Chris Anderson >> http://jchrisa.net >> http://couch.io >> >