From: Paul Davis
Date: Mon, 24 Sep 2012 11:45:46 -0500
Subject: Re: recovering data from an unfinished compaction db
To: user@couchdb.apache.org

The compactor is written to flush batches of docs every 5K bytes and
then write a header out every 5M bytes (assuming default batch sizes).
It's important to remember that this is judged against #doc_info{}
records, which don't contain a full doc body. For documents with
relatively few revisions we're looking at (rough guess) ~100 bytes per
record, which is going to give us 50K docs per header commit. Seeing as
the OP mentions lots of attachments, this could give us a relatively
large gap in the file to search for a header.

On Mon, Sep 24, 2012 at 11:17 AM, Tim Tisdall wrote:
> Since this is the result of a compaction, shouldn't the header be at
> the beginning of the file? (just testing my knowledge on how all this
> works...)
>
> -Tim
>
> On Mon, Sep 24, 2012 at 12:09 PM, Robert Newson wrote:
>> That does imply that the last valid header is a long way back up the
>> file, though.
>>
>> On 24 September 2012 17:00, Paul Davis wrote:
>>> I'd ignore the snappy error for now. There's no way this thing ran for
>>> an hour and then suddenly hit an error in that code. If this is like a
>>> bug I've seen before the reason that this runs out of RAM is due to
>>> the code that's searching for a header not releasing binary ref counts
>>> as it should be.
>>>
>>> The quickest way to fix this would probably be to go back and update
>>> recover-couchdb to recognize the new disk format. Although that gets
>>> harder now that snappy compression is involved.
>>>
>>> On Mon, Sep 24, 2012 at 10:32 AM, Dave Cottlehuber wrote:
>>>> On 24 September 2012 15:02, Robert Newson wrote:
>>>>> {badmatch,{error,snappy_nif_not_loaded} makes me wonder if this 1.2
>>>>> installation is even right.
>>>>>
>>>>> Can someone enlighten me? Is it possible to get this error spuriously?
>>>>
>>>> No. I'd be keen to see a bit of logfiles to understand what's not working.
>>>>
>>>>> Does running out of RAM cause erlang to unload NIF's?
>>>>
>>>> I don't think so on Windows.
>>>>
>>>> There's an R15B01 based build here:
>>>> https://www.dropbox.com/sh/jeifcxpbtpo78ak/GG9fjWOyDt/Snapshots/20120524
>>>> that has a fix for a more recent version of Windows server than I have
>>>> to address one NIF loading error, although there are a number of
>>>> possible causes.
>>>>
>>>> @Rudi can you give this a go & report back?
>>>>
>>>> A+
>>>> Dave
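
For anyone checking the estimate at the top of this message, here is a
minimal Python sketch of the arithmetic. It only restates the rough
figures quoted there (a header about every 5M bytes of #doc_info{}
records, about 100 bytes per record) and treats 5M as 5,000,000. The
function names and the per-document on-disk size are hypothetical,
chosen purely for illustration; none of this is CouchDB code.

```python
# Rough estimate only. Both constants are the guesses quoted in the message,
# not values read from a CouchDB installation.
HEADER_INTERVAL_BYTES = 5_000_000  # "write a header out every 5M bytes"
DOC_INFO_BYTES = 100               # "(rough guess) ~100 bytes per record"


def docs_per_header_commit(header_interval_bytes: int, doc_info_bytes: int) -> int:
    """Number of #doc_info{} records counted between two header writes."""
    return header_interval_bytes // doc_info_bytes


def gap_between_headers_bytes(docs: int, avg_on_disk_doc_bytes: int) -> int:
    """Approximate file span between two headers.

    avg_on_disk_doc_bytes is a hypothetical average size of one document
    including its attachments; it is an assumption, not a measured value.
    """
    return docs * avg_on_disk_doc_bytes


docs = docs_per_header_commit(HEADER_INTERVAL_BYTES, DOC_INFO_BYTES)
print(docs)  # 50000, i.e. the "50K docs per header commit" figure

# Purely illustrative: if each document averaged 200,000 bytes on disk,
# two consecutive headers would sit roughly this many bytes apart.
print(gap_between_headers_bytes(docs, 200_000))
```

The second function is only there to show why attachment-heavy documents
widen the gap: the header interval is measured against small
#doc_info{} records, while the file grows by the full document size.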