couchdb-user mailing list archives

From Rudi Benkovic <>
Subject Re: recovering data from an unfinished compaction db
Date Tue, 25 Sep 2012 08:53:01 GMT
Hello Paul,

Monday, September 24, 2012, 6:45:46 PM, you wrote:

> The compactor is written to flush batches of docs every 5K bytes and
> then write a header out every 5M bytes (assuming default batch sizes).
> It's important to remember that this is judged against #doc_info{} records,
> which don't contain a full doc body. For documents with relatively few
> revisions we're looking at (rough guess) ~100 bytes per record, which
> is going to give us 50K docs per header commit. Seeing as the OP
> mentions lots of attachments, this could give us a relatively large gap
> in the file to search for a header.

FWIW, indeed, the last valid header in this compaction DB was written
around 20GB from the end of the file. The easiest way to get CouchDB
up and running at that revision:

Open the .couch file in a hex editor and search for "db_header"
backwards from the end of the file. The header looks like this, in hex:

01 00 00 00 .... .db_header .... 00 00 03 E8

Truncate the file in place (Linux: truncate -c --size=<bytes>
file.couch) to the end of the last full header (byte size: position of
the final E8, plus one). Feed it to CouchDB and it should work.
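If you'd rather not eyeball a 20GB file in a hex editor, the same search
can be scripted. This is a minimal sketch, assuming the header layout is
exactly as in the hex dump above (a "db_header" atom followed by a
00 00 03 E8 terminator); the file name is hypothetical, and you should
double-check the reported offset in a hex editor and work on a copy
before truncating anything:

```python
#!/usr/bin/env python3
# Sketch: find the last "db_header" atom in a .couch file, locate the
# 00 00 03 E8 terminator that follows it, and print the byte size to
# pass to truncate(1). Layout assumed from the hex dump above.
import sys

def last_header_end(data: bytes) -> int:
    """Size to truncate the file to (position of the final E8 byte,
    plus one), or -1 if no complete header is found."""
    hdr = data.rfind(b"db_header")          # last occurrence in the file
    if hdr == -1:
        return -1
    term = data.find(b"\x00\x00\x03\xe8", hdr)  # terminator after it
    if term == -1:
        return -1
    return term + 4

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        data = f.read()  # for files this large, mmap would be kinder
    print(last_header_end(data))
```

Run it as `python3 find_header.py file.couch`, then feed the printed
number to `truncate -c --size=<bytes> file.couch`.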

This still leaves me with 20GB of data to resurrect from the dead.
Recover-couchdb hacking ahead. :)

Best regards,
