Date: Thu, 11 Oct 2012 16:32:13 -0700
Subject: Re: pushing limits and possible corrupt db
From: Jason Konrad
To: user@couchdb.apache.org

There was plenty of room left on the disk, so I don't think that was the
problem. I've managed to find this at the end of the .couch file.

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000390  01 00 00 00 6b ba 06 44  2d c7 74 3f 55 7e 40 74  |....k..D-.t?U~@t|
000003a0  1a 80 75 48 1c 83 68 0b  64 00 09 64 62 5f 68 65  |..uH..h.d..db_he|
000003b0  61 64 65 72 61 05 62 05  7d 07 b3 61 00 68 02 6e  |adera.b.}..a.h.n|
000003c0  05 00 7d c5 1a b1 27 68  02 62 00 92 e5 46 62 00  |..}...'h.b...Fb.|
000003d0  07 26 6d 68 02 6e 05 00  32 e5 1a b1 27 62 00 9a  |.&mh.n..2...'b..|
000003e0  0b b3 68 02 6e 05 00 6a  82 92 f3 0c 6a 61 00 64  |..h.n..j....ja.d|
000003f0  00 03 6e 69 6c 64 00 03  6e 69 6c 62 00 00 03 e8  |..nild..nilb....|
00000400

This looks like a valid header according to another post on the mailing
list, which mentions the format "01 00 00 00 .... .db_header .... 00 00 03 E8".

I have not tried to open the .couch file yet, since I first have to set up
another CouchDB instance to use. Will be trying that soon.

On Thu, Oct 11, 2012 at 3:10 PM, Paul Davis wrote:
> Definitely sounds like an emfile error. Could be that erlang
> translates that when opening a file but I'd have to check. I have seen
> issues with .couch files having issues when running out of disk space
> and the like. To recover, I would make a copy of your .couch file, and
> then start truncating it a bit at a time to try and find the last
> valid header that you can read from.
>
> On Thu, Oct 11, 2012 at 3:38 PM, Jason Konrad wrote:
>> Today my Couch (1.0.1) took a turn for the worse. Couchdb seemed to be
>> stuck in some sort of error loop which rendered all databases useless.
>> I managed to isolate the problem to a specific database which was
>> causing the following error.
>>
>> [Thu, 11 Oct 2012 17:26:02 GMT] [error] [<0.595.0>] ** Generic server
>> <0.595.0> terminating
>> ** Last message in was {pread_iolist,170475046269}
>> ** When Server state == {file,{file_descriptor,prim_file,{#Port<0.2136>,23}},
>> 0,170475073648}
>> ** Reason for termination ==
>> ** {{badmatch,{ok,<<99,111,109,104,2,110,5,0,186,181,74,98,39,104,2,98,0,0
>> ....SNIP...
>> 106,97,0,100,0,3,110,105,108,100,0,3,110,105,108,98,0,0,
>> 3,232>>}},
>> [{couch_file,read_raw_iolist_int,3},
>> {couch_file,handle_call,3},
>> {gen_server,handle_msg,5},
>> {proc_lib,init_p_do_apply,3}]}
>>
>> I stopped couchdb and moved the entire .couch file out of the database
>> directory and then started couchdb. After that couchdb appeared to be
>> normal with the other databases. The database in questions is ~159GB
>> with ~9M documents and each document has 2-10 attachments.
>>
>> I have recently been running up against some system_limit errors which
>> may have something to do with this although I have not seen any of
>> these errors today. I'm also trying to track down what system_limit
>> I'm hitting. If it was open files wouldn't it be a emfile error?
>>
>> ** {{badmatch,{error,system_limit}},
>> [{couch_file,sync,1},
>> {couch_db_updater,commit_data,2},
>> {couch_db_updater,update_docs_int,5},
>> {couch_db_updater,handle_info,2},
>> {gen_server,handle_msg,5},
>> {proc_lib,init_p_do_apply,3}]}
>>
>>
>> Any thoughts or suggestions would be appreciated.
>>
>> -Jason
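
For anyone following the suggestion above of hunting for the last valid header, here is a rough Python sketch of how candidate headers might be located without truncating by hand. It is only a sketch under assumptions that are not confirmed anywhere in this thread: that headers begin on 4096-byte block boundaries, that the 0x01 marker byte is followed by a 4-byte big-endian length, and that a 16-byte checksum precedes the encoded term. The second assumption is at least consistent with the dump, where 00 00 00 6b (107) matches the number of bytes remaining after it. The names find_header_candidates and looks_like_header are made up for illustration. The script opens the file read-only, but it should still be pointed at a copy.

```python
import os

BLOCK_SIZE = 4096        # assumption: headers start on 4096-byte block boundaries
HEADER_MARKER = 0x01     # leading byte from the "01 00 00 00 ..." pattern in the post
HEADER_ATOM = b"db_header"
MD5_SIZE = 16            # assumption: a 16-byte checksum precedes the encoded term


def looks_like_header(chunk):
    """Return True if chunk starts with something shaped like a db_header block."""
    if len(chunk) < 5 or chunk[0] != HEADER_MARKER:
        return False
    # Assumption: the next four bytes are a big-endian payload length.
    length = int.from_bytes(chunk[1:5], "big")
    if length <= MD5_SIZE:
        return False
    # The atom name should appear inside the declared payload.
    return HEADER_ATOM in chunk[5:5 + length]


def find_header_candidates(path, max_hits=5):
    """Scan block boundaries from the end of the file backwards.

    Returns up to max_hits byte offsets, highest offset first.
    """
    hits = []
    size = os.path.getsize(path)
    offset = (size // BLOCK_SIZE) * BLOCK_SIZE
    with open(path, "rb") as f:
        while offset >= 0 and len(hits) < max_hits:
            f.seek(offset)
            if looks_like_header(f.read(BLOCK_SIZE)):
                hits.append(offset)
            offset -= BLOCK_SIZE
    return hits
```

Calling find_header_candidates on a copy of the .couch file returns byte offsets where a block looks like a header. Each one marks a spot worth inspecting with a hex dump before deciding where to truncate the copy. A match only means the bytes have the expected shape, not that the header's checksum or contents are valid.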