couchdb-user mailing list archives

From Jason Smith <>
Subject Re: Data loss in CouchDB 1.0.1
Date Tue, 07 Jun 2011 02:06:36 GMT
Thanks! I've used good tools. I would classify grep_couch as a "bad
tool" but maybe "best of class" :)

Compaction creates a new file on the fs and unlinks the old one. If
you compact regularly, you'll tend to have *lots* of ejson fragments
lying around in the un-(re)-allocated free parts of the fs.

That reminds me, the tool does not account for filesystem
fragmentation or anything that would make the doc not physically
contiguous on disk. Fortunately most docs are small (even in ejson!)
and they survive, especially if you've compacted a few times and you
have duplicate data on-disk but with dissimilar fragment locations.
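
For illustration, here is a minimal Python sketch of the scanning idea
described above. It is not grep_couch itself. It assumes, purely for
the sake of the example, that the fragments of interest are plain JSON
text sitting somewhere in a blob of raw bytes, which may not match the
real on-disk encoding. The function name scan_json_objects is
hypothetical. Like the tool, it only finds objects that are physically
contiguous in the input.

```python
import json


def scan_json_objects(blob: bytes):
    """Yield every non-empty JSON object found in a blob of raw bytes.

    Assumption (illustration only): fragments are plain JSON text.
    An object split across non-adjacent regions will not be found.
    """
    decoder = json.JSONDecoder()
    # Undecodable bytes become U+FFFD so that scanning can continue.
    text = blob.decode("utf-8", errors="replace")
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # returns it together with the index just past its end.
            obj, end = decoder.raw_decode(text, start)
        except ValueError:
            # Not valid JSON from this brace; try the next one.
            pos = start + 1
            continue
        if isinstance(obj, dict) and obj:
            yield obj
        pos = end


if __name__ == "__main__":
    sample = b'\x00junk{"name": "alice"}\xff{broken {"k": [1, 2]} tail'
    for doc in scan_json_objects(sample):
        print(doc)
```

A real device is far too large to read into memory at once, so an
actual scan would have to work in overlapping chunks. The sketch skips
that to keep the core loop visible.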

On Tue, Jun 7, 2011 at 8:38 AM, Paul J. Davis
<> wrote:
> Jason,
> Good tool, but unless I'm mistaken, the issue here is that the data just
> doesn't exist on disk. I think we're fairly sure that this isn't the 1.0.0
> bug but something else. I'm leaning towards something config specific, but
> all of our theories appear to be incorrect given the reported observations.
> On Jun 6, 2011, at 9:29 PM, Jason Smith <> wrote:
>> I once made a very simple CouchDB undelete tool. It scans your disk
>> device for anything that looks like the on-disk CouchDB JSON format.
>> I've recovered data with it, but notably, the _id and _rev are *not*
>> stored with the rest of a document, so you tend to get lots of docs
>> with no _id field. (I'm considering always having an "id" field to
>> dupe the "_id" in case I ever have to do that again.)
>> On Mon, Jun 6, 2011 at 3:35 PM, René Brüntrup <> wrote:
>>> Hello!
>>>> 1) Are you certain that you were in fact writing to the database on this
>>>> server and not the replica? Can you share some access logs towards that end?
>>> We could not find the missing data in any of the replicated files. Each
>>> backup has an increasing number of documents with the least amount of
>>> missing data in the newest backup. Access logs from 2011-03-08 are
>>> unfortunately missing, because the log rotation already removed them.
>>>> 2) Is it possible that you've inadvertently restored the database file from
>>>> a backup?
>>> No backup was created at this date and we do not have any mechanisms
>>> that could automatically restore the backups.
>>>> 3) Is it possible that you were writing "underneath" the encrypted LVM volume
>>>> for the past two months?
>>> Our system does not work without an initialized database that contains a
>>> number of user account and definition documents. But even if such an
>>> initialized database would have been available underneath the encrypted
>>> volume we would have noticed a data loss after changing the database,
>>> because the system was already in use before the 2011-03-08.
>>> We will check again whether there is a database file underneath the
>>> encrypted volume, but we cannot stop the system right now. When
>>> replicating the database we just noticed that the timestamp of the
>>> source database was updated. The replication processes that were
>>> started after 2011-03-08 and before the reboot of the server did not
>>> change the timestamp of the source database.
>>> Regards,
>>> René
>> --
>> Iris Couch

Iris Couch
