Subject: Re: Corrupt database file
From: Robert Samuel Newson
Date: Fri, 21 Mar 2014 22:44:02 +0000
To: user@couchdb.apache.org

"If the DB was corrupted because the disk became full, shouldn't the DB still be fine but just missing the most recent commits?"

Yes, that's the virtue of the append-only nature of database files. That said, corruption is only flagged when the md5 checksums fail to verify, so it's hard to imagine it being a false positive.

Did the compaction attempt fail? Can it be replicated? If not, I would reluctantly truncate a few meg off the file and see if it can be opened (do this when couchdb is not running). The actual corrupted file would be useful to couchdb developers so that we could investigate the raw data at the corruption site.

What was the disk system here? RAID? Filesystem? Would your disk controllers reorder writes at all?

B.
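[A minimal sketch of the truncation experiment Robert suggests, in Python. The paths and step size are illustrative only; it deliberately works on a copy, with CouchDB stopped, so the original corrupt file stays intact for the developers to inspect:]

    #!/usr/bin/env python
    # Sketch only: shave a few MB off a COPY of the .couch file, then see
    # whether CouchDB will open the copy. Run while CouchDB is stopped.
    import os
    import shutil

    SOURCE = "/var/lib/couchdb/dbname.couch"        # hypothetical corrupt db
    WORK = "/var/lib/couchdb/dbname_trunc.couch"    # scratch copy to test
    STEP = 4 * 1024 * 1024                          # drop 4 MB per attempt

    shutil.copyfile(SOURCE, WORK)                   # never touch the original
    new_size = max(0, os.path.getsize(WORK) - STEP)
    with open(WORK, "r+b") as f:
        f.truncate(new_size)
    print("truncated copy to %d bytes; retry opening it" % new_size)

[Because the file is append-only, each truncation step just discards the newest commits; repeat until CouchDB accepts the file or nothing useful remains.]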
On 21 Mar 2014, at 20:19, Tim Tisdall wrote:

> If the DB was corrupted because the disk became full, shouldn't the DB
> still be fine but just missing the most recent commits? Or would a
> person need to truncate a certain number of bytes off the end of the DB
> to get it to read properly?
>
> As for JSON file size... I always dump the DB into a GZ file and then my
> scripts work on it as a GZ'ed file. In my case the JSON is 20gb and the
> gz file is 3.5gb. Dealing with the file as a gz adds a little more
> complexity to the script you use to process it, though.
>
> On 21 March 2014 15:46, Jens Alfke wrote:
>
>> On Mar 21, 2014, at 8:43 AM, Alexander Shorin wrote:
>>
>>> we don't have any knowledge about how it was repacked, what
>>> changes it includes, how it differs from the original CouchDB 1.2.0
>>> and so on and so forth.
>>
>> Is that relevant? Nothing's come up in this thread that depends on
>> exact details of CouchDB internals. Can we focus on the issue at hand,
>> namely that the OP has a CouchDB that ran out of disk space and
>> corrupted its database and he'd like to recover the data?
>>
>> (IIRC, if there were any source changes from stock 1.2 they were
>> minor, maybe just around branding. Maybe Jan, Dale, or Filipe remember
>> more about it?)
>>
>>> The Couchbase team probably knows more about their products.
>>
>> Couchbase's forums are not going to respond to support requests for a
>> product that's been discontinued for over two years, from someone
>> who's (presumably) not a paying customer.
>>
>> --Jens
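[Tim's trick of working on the dump directly as a gz file can be a short streaming script. A minimal sketch in Python, assuming the dump is line-delimited JSON (one document per line); the file name is an example:]

    #!/usr/bin/env python
    # Sketch of processing a gzipped JSON dump without ever expanding it
    # on disk. Assumes one JSON document per line (assumption, not a
    # detail confirmed in the thread).
    import gzip
    import json

    count = 0
    with gzip.open("dbname_dump.json.gz", "rt", encoding="utf-8") as f:
        for line in f:
            doc = json.loads(line)   # one document per line
            count += 1               # replace with real work on `doc`
    print("processed %d documents" % count)

[Streaming like this keeps memory use flat even at the 20 GB JSON / 3.5 GB gz sizes Tim mentions, since only one line is decompressed at a time.]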