couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "ReleaseNotice1.0.0RepairTool" by JanLehnardt
Date Sun, 26 Feb 2012 22:07:38 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "ReleaseNotice1.0.0RepairTool" page has been changed by JanLehnardt:
http://wiki.apache.org/couchdb/ReleaseNotice1.0.0RepairTool

New page:

This is a page where we document the development of the 1.0.0 data loss recovery tool.

Damien says:

> We need fixup code to try to find the lost btree roots (by id and by seq) and add them
to the header and write it.

> I think there should be a fix-up flag in the ini, and when set startup couchdb will scan
all the databases and any that don't have a valid header at the end, it scans backward, looking
for the root of the by_id and by_seq btrees. Then it adds those roots to the first header
it finds and writes it.

> It should attempt to load the roots using file:pread_term, stepping back one byte at
a time, looking first for the by_seq index root. Most of the attempts will be bad
reads or binary_to_term errors and will throw an exception. But some of the reads will produce
a valid term, and it will do a pattern match to check successfully loaded structures for the
proper content to ensure it's the right tree. Then it looks for the by_id root, doing the
same.

> Then using the most recent header, add the root addresses to the header and rewrite it.
The database is restored.
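
The backward scan Damien describes can be sketched in a few lines. This is only an illustration in Python, not the Erlang code of the actual tool: it assumes a toy file format of length-prefixed JSON terms in place of CouchDB's real on-disk format, and every name in it (`append_term`, `try_read_term`, `find_last_root`, `repair`, the `"type"` tags) is hypothetical.

```python
import json
import struct


def append_term(buf: bytearray, term) -> int:
    """Append a length-prefixed JSON term and return the offset it starts at."""
    payload = json.dumps(term).encode("utf-8")
    offset = len(buf)
    buf.extend(struct.pack(">I", len(payload)) + payload)
    return offset


def try_read_term(data: bytes, pos: int):
    """Try to decode a term at pos; raise ValueError on any bad read."""
    if pos + 4 > len(data):
        raise ValueError("truncated length prefix")
    (length,) = struct.unpack_from(">I", data, pos)
    end = pos + 4 + length
    if end > len(data):
        raise ValueError("length runs past the end of the file")
    # UnicodeDecodeError and json.JSONDecodeError are both ValueError subclasses.
    return json.loads(data[pos + 4:end].decode("utf-8"))


def find_last_root(data: bytes, kind: str):
    """Step back one byte at a time until a term tagged `kind` decodes."""
    for pos in range(len(data) - 1, -1, -1):
        try:
            term = try_read_term(data, pos)
        except ValueError:
            continue  # most positions are bad reads, as expected
        # "Pattern match" on the decoded structure to confirm it is the right tree.
        if isinstance(term, dict) and term.get("type") == kind:
            return pos, term
    return None


def repair(buf: bytearray) -> dict:
    """Find both roots, then append a header that points at them."""
    data = bytes(buf)
    seq = find_last_root(data, "by_seq_root")
    ids = find_last_root(data, "by_id_root")
    if seq is None or ids is None:
        raise LookupError("could not find both btree roots")
    header = {"type": "header", "by_seq": seq[0], "by_id": ids[0]}
    append_term(buf, header)
    return header


# Toy database file: data and roots were written, but the final header was not.
buf = bytearray()
append_term(buf, {"type": "doc", "id": "a"})
append_term(buf, {"type": "by_seq_root", "count": 1})
append_term(buf, {"type": "doc", "id": "b"})
id_off = append_term(buf, {"type": "by_id_root", "count": 2})
seq_off = append_term(buf, {"type": "by_seq_root", "count": 2})
buf.extend(b"\x00\xffgarbage")
header = repair(buf)
```

The sketch appends a fresh header rather than patching the most recent one, which keeps it short; the real tool would need to carry over the other fields of the last valid header.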

Chris replies:

> Once that is working, it should be a straightforward enhancement to trigger the compactor
(minus deletion of the source file) to run from all (or better yet, just those missing a corresponding
header) valid btree roots in the file (making snapshot databases of any state which might
have been lost due to a restart). If a couch has been restarting frequently, the recovery
might require creating a number of snapshots and then using the replicator to merge all the
snapshots back into one database.


Jan sums up:

> There are two scenarios so far we need to cover: a) user's been using couch, got restarted,
data is "lost" (i.e. data is at the end of the file, but the header isn't written) and b)
user has been using couch, data got "lost", user kept using couch, so there is maybe data
at the end of the file without a valid header and then some valid data and then some more
unreferenced data. For a), simply stopping at the first valid header and doing the restore
is ok. b) is harder, as it is a potential full db scan and we can't just fix up the tree. In that
case mikeal suggested we should copy these docs to a new database to later replicate from.

== Mailing list discussion ==

Catching up with progress as you drink your morning coffee? See [Jan's first post on the tool](http://mail-archives.apache.org/mod_mbox/couchdb-dev/201008.mbox/%3c8385F758-360B-425A-ACBD-03C898BFDA21@apache.org%3e)
and the subsequent thread.

== Git status ==

As of 10 August, 2010.

![git_status](http://wiki.couchone.com:5984/pages/recovery-tool/git.png "Git status")

== Proposed Recovery Procedure (user side of things) ==

 1. Make a backup of everything.
 2. Stop CouchDB 1.0.0.
 3. Install CouchDB 1.0.1.
   3.1. Point database_dir at the original installation.
 4. Set in local.ini:
    [couchdb]
    recovery = true
 5. Start CouchDB 1.0.1.
 6. Wait until CouchDB says "Time to Relax.".
  6.1. CouchDB will have switched the config to `recovery = false`.
  6.2. CouchDB will 
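
Step 4 above could be scripted rather than edited by hand. The following is a minimal sketch using Python's standard `configparser`; the `recovery` option in the `[couchdb]` section is the one proposed on this page, while the helper name `enable_recovery` and the sample `database_dir` value are made up for illustration. Note that `configparser` drops comments when it rewrites a file, so hand editing may be preferable for a heavily commented `local.ini`.

```python
import configparser
import io


def enable_recovery(ini_text: str) -> str:
    """Return ini_text with `recovery = true` set in the [couchdb] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section("couchdb"):
        parser.add_section("couchdb")
    parser.set("couchdb", "recovery", "true")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical local.ini contents, for demonstration only.
updated = enable_recovery("[couchdb]\ndatabase_dir = /var/lib/couchdb\n")
```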
