Subject: Re: An old database causes couchdb 1.3.0 to crash getting single documents
From: Alexander Shorin
To: user@couchdb.apache.org
Date: Wed, 1 May 2013 15:13:50 +0400

On Mon, Apr 29, 2013 at 8:16 PM, James Marca wrote:
>
> I was able to get couchdb 1.2.x running on this machine, but 1.1.x
> dies on start.
>
> 1.2.x does not compact this db. I got some quick erlang errors, then
> RAM usage slowly rose to 95% so I killed it.
>
> Other dbs of the same "generation" work fine...I can access them and
> build views and so on. The only difference in the dbs is the data.
> The problem one is the first one I started using, before I decided to
> manually shard the data. All the dbs have identical design docs and
> all that. My wild ass guess is that it's something I did early on in
> the early batches of model runs polluting the db.
>
> After a week of fighting this (my awareness of the scope of the
> problem built slowly!), I'm thinking it might be easier to just
> re-run my models, and re-generate the data...at least then the problem
> is just CPU time.
>
> Thanks for the advice.
>
> James

Hi James,

I can share your pain: in my practice I have had a lot of broken databases
that acted in a similar way. Most of them were under heavy concurrent write
load: a "deadman's" database receiving data via replications from 2-3
sources, plus massive bulk updates, plus triggered _update handlers, all at
the same time. And the server always ran with *delayed_commits: true*. They
failed with various symptoms: from explicit badrecord_db errors in the logs,
or random weird crash reports in the middle of data processing, to forcing
CouchDB to consume all system memory without stopping. However, I have never
hit such problems without delayed_commits. Do you also have this option
enabled? (You can check it with the commands at the end of this message.)

> After a week of fighting this (my awareness of the scope of the
> problem built slowly!), I'm thinking it might be easier to just
> re-run my models, and re-generate the data...at least then the problem
> is just CPU time.

Yes, this is the easiest way to work around it. It is also better to keep a
backup copy somewhere for emergency cases.
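In case it helps, here is a quick way to check (and disable) delayed_commits
on a 1.x node. This is just a sketch: it assumes the node listens on the
default port 5984 and has no admin credentials configured (add -u admin:pass
otherwise):

    # read the current value ("true" or "false")
    curl http://localhost:5984/_config/couchdb/delayed_commits

    # turn it off at runtime; returns the previous value
    curl -X PUT http://localhost:5984/_config/couchdb/delayed_commits -d '"false"'

To make it permanent, set it in local.ini:

    [couchdb]
    delayed_commits = false

For the backup copy: with 1.x every database is a single append-only .couch
file under database_dir, so copying that file is enough. The paths below are
just an example, your install may keep them elsewhere:

    cp /var/lib/couchdb/mydatabase.couch /backups/mydatabase.couch

--
,,,^..^,,,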