From: Randall Leeds
To: dev@couchdb.apache.org
Subject: Re: Crash of CouchDB 1.2.x
Date: Sun, 11 Mar 2012 22:00:41 -0700

On Mar 11, 2012 7:40 PM, "Jason Smith" wrote:
>
> On Mon, Mar 12, 2012 at 8:44 AM, Randall Leeds wrote:
> > I'm not sure what else you could provide after the fact. If your couch
> > came back online automatically, and did so quickly, I would expect to
> > see very long response times while the disk was busy freeing the old,
> > un-compacted file. We have had some fixes in the last couple releases
> > to address similar issues, but maybe there's something lurking still.
> > I've got no other ideas/leads at this time.
>
> Another long shot, but you could try a filesystem that doesn't
> synchronously reclaim the space, like (IIRC) XFS, btrfs, or I think
> ext2.

I think you're referring to extents, which, IIRC, allow large, contiguous
sections of a file to be allocated and freed with less bookkeeping and,
therefore, fewer writes. This behavior is not any more or less synchronous.
In my production experience, XFS does not show much benefit from this,
because any machine hosting more than one growing database still ends up
with file fragmentation that limits the gains from extents. I suspect, but
have not tried to verify, that very large RAID stripe sizes, which force
preallocation of larger blocks, might deliver some gains.

I have an open ticket for a manual delete option, which was designed to
allow deletion of trashed files to occur during low-volume hours or using
tools like ionice. Unfortunately, I never got a chance to experiment with
that setup in production, though I have seen ionice help significantly to
keep request latency down when doing large deletes (just not in this
particular use case).
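
For anyone who wants to try the ionice idea by hand, here is a rough sketch
in Python. It assumes Linux with util-linux's ionice on the PATH, and it
assumes an I/O scheduler that honors the idle class at all. The helper names
(idle_rm_command, delete_trashed) and the notion of a "trash directory" are
made up for illustration; this is not the manual delete option from the
ticket and not anything CouchDB ships.

```python
import shutil
import subprocess
from pathlib import Path


def idle_rm_command(path):
    """Build an argv that removes `path` in the idle I/O scheduling class."""
    # `ionice -c 3` selects the idle class: the process is given disk time
    # only when nothing else is asking for it (scheduler permitting).
    return ["ionice", "-c", "3", "rm", "-f", "--", str(path)]


def delete_trashed(trash_dir, dry_run=True):
    """Remove every regular file in `trash_dir` at idle I/O priority.

    `trash_dir` is a hypothetical directory holding files that are already
    safe to delete. Returns the commands that were (or would be) run.
    """
    files = sorted(p for p in Path(trash_dir).iterdir() if p.is_file())
    commands = [idle_rm_command(p) for p in files]
    if not dry_run:
        if shutil.which("ionice") is None:
            raise RuntimeError("ionice not found; this sketch assumes Linux")
        for argv in commands:
            # check=True raises CalledProcessError if rm fails.
            subprocess.run(argv, check=True)
    return commands
```

Run it with dry_run=True first to see what would be removed, then schedule
the real run from cron during low-volume hours. Whether the idle class
actually helps with large deletes on a given filesystem is something you
would have to measure.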