couchdb-dev mailing list archives

From "Igor Klimer (JIRA)" <>
Subject [jira] [Commented] (COUCHDB-2040) Compaction fails when copying attachment
Date Wed, 29 Jan 2014 15:16:09 GMT


Igor Klimer commented on COUCHDB-2040:

Yes, I've removed the fixed version - I'll do the replication again during the weekend and
replace the old database with the fixed/replicated one. Seems counterintuitive, but the admins
were not comfortable with the amount of disk space both of the databases occupied and I can't
swap the databases during the day.
We'll try to run compaction at a regular rate from now on, now that it seems to have even more benefits.
As far as I'm concerned, this bug can be closed if adding proper handling of such situations
is a more complicated task. Maybe a new task/improvement could be opened in JIRA for it instead?
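For reference, compaction can be triggered and monitored through CouchDB's standard HTTP API (`_compact` and `_active_tasks`); the database name `ecrepo` is taken from the report, and admin credentials may be required if the server is not in admin-party mode:

```shell
# Trigger compaction of the "ecrepo" database (the Content-Type header is required)
curl -X POST -H "Content-Type: application/json" http://localhost:5984/ecrepo/_compact

# Monitor progress; a running compaction shows up here with a "progress" field
curl http://localhost:5984/_active_tasks
```

Running the first command from a cron job is one simple way to keep compaction happening at a regular rate.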

> Compaction fails when copying attachment
> ----------------------------------------
>                 Key: COUCHDB-2040
>                 URL:
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>            Reporter: Igor Klimer
> Original discussion from the user mailing list:
> Digest:
> During database compaction, the process fails at about 50% with the following error: (CouchDB 1.2.0, Windows Server 2008 R2 Enterprise).
> After the server and CouchDB upgrade, the error is still the same:
> (CouchDB 1.5.0, Ubuntu 12.04.3 LTS (GNU/Linux 3.8.0-33-generic x86_64)).
> There was one prior attempt at compaction that failed because of insufficient disk space:
> After this initial failure, I've made sure that there's sufficient disk space for the
> .compact file.
> The .compact file was always removed before trying compaction again.
> At the request of Robert Samuel Newson, I've also tried with an empty .compact file -
> the results were the same:
> Our I/O subsystem consists of some RAID5 arrays - the admins claim that they've been
> running error-free since inception ;) We have yet to run a parity check, since that would
> require taking the array offline and I'd rather not do that without exhausting other options.
> Config files from the 1.2.0/Windows server (since that's where the fault must have occurred):
> default.ini:
> local.ini:
> Other than delayed_commits being at its default of true, there are no options set that
> could affect fsync() behaviour and the like.
> I've run:
> curl localhost:5984/ecrepo/_changes?include_docs=true
> curl localhost:5984/ecrepo/_all_docs?include_docs=true
> and both calls succeeded, which would suggest that a faulty attachment (incorrect
> checksum/length) is at fault somewhere.
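One way to sanity-check a suspect attachment is to recompute its digest locally: CouchDB records each attachment's MD5 in the document's `_attachments` metadata as `md5-` followed by the base64-encoded hash. A minimal sketch (the sample bytes, file name, and `DOCID`/`ATTNAME` are placeholders; in practice you'd download the real attachment first):

```shell
# Recompute a CouchDB-style attachment digest ("md5-" + base64-encoded MD5).
# In practice, fetch the attachment first, e.g.:
#   curl -o att.bin http://localhost:5984/ecrepo/DOCID/ATTNAME   # DOCID/ATTNAME are placeholders
printf 'hello' > att.bin                # sample bytes for illustration
digest="md5-$(openssl dgst -md5 -binary att.bin | base64)"
echo "$digest"
# Compare against the "digest" field under _attachments in the document JSON;
# a mismatch would point at corrupted attachment data on disk.
```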

This message was sent by Atlassian JIRA
