Subject: Re: Crash of CouchDB 1.2.x
From: Robert Newson
To: dev@couchdb.apache.org
Date: Mon, 12 Mar 2012 10:32:11 +0000

I can confirm that XFS is aggressive when deleting large files (other
i/o requests are slow or blocked while it does it). It has been
necessary to iteratively truncate a file instead of a simple 'rm' in
production to avoid that problem. Increasing the size of extent
preallocation ought to help considerably, but I've not yet deployed
that change. I *can* confirm that you can't 'ionice' the rm call,
though.

B.

On 12 March 2012 05:00, Randall Leeds wrote:
> On Mar 11, 2012 7:40 PM, "Jason Smith" wrote:
>>
>> On Mon, Mar 12, 2012 at 8:44 AM, Randall Leeds wrote:
>> > I'm not sure what else you could provide after the fact. If your couch
>> > came back online automatically, and did so quickly, I would expect to
>> > see very long response times while the disk was busy freeing the old,
>> > un-compacted file. We have had some fixes in the last couple releases
>> > to address similar issues, but maybe there's something lurking still.
>> > I've got no other ideas/leads at this time.
>>
>> Another long shot, but you could try a filesystem that doesn't
>> synchronously reclaim the space, like (IIRC) XFS, btrfs, or I think
>> ext2.
>
> I think you're referring to extents, which, IIRC, allow large, contiguous
> sections of a file to be allocated and freed with less bookkeeping and,
> therefore, fewer writes. This behavior is not any more or less synchronous.
>
> In my production experience, xfs does not show much benefit from this
> because any machine with more than one growing database still ends up
> with file fragmentation that limits the gains from extents.
>
> I suspect, but have not tried to verify, that very large RAID stripe
> sizes that force preallocation of larger blocks might deliver some gains.
>
> I have an open ticket for a manual delete option, which was designed to
> allow deletion of trashed files to occur during low-volume hours or using
> tools like ionice. Unfortunately, I never got a chance to experiment with
> that setup in production, though I have seen ionice help significantly to
> keep request latency down when doing large deletes (just not in this
> particular use case).
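
For illustration only, here is a minimal sketch of the iterative-truncate
idea mentioned above. It is not anything from this thread or from CouchDB
itself; the chunk size, pause, and fsync placement are assumptions you
would want to tune against your own disks before relying on it.

    #!/usr/bin/env python
    # Sketch: shrink a large file a chunk at a time before unlinking it,
    # so the filesystem frees extents gradually instead of all at once.
    # CHUNK and PAUSE are made-up knobs, not values anyone here deployed.
    import os
    import sys
    import time

    CHUNK = 256 * 1024 * 1024   # shrink 256 MB per step (assumption)
    PAUSE = 0.5                 # give other i/o a chance between steps

    def slow_delete(path):
        size = os.path.getsize(path)
        with open(path, 'r+b') as f:
            while size > 0:
                size = max(0, size - CHUNK)
                f.truncate(size)      # free a slice of extents
                f.flush()
                os.fsync(f.fileno())  # push the metadata update out now
                time.sleep(PAUSE)
        os.remove(path)               # file is empty, rm is now cheap

    if __name__ == '__main__':
        slow_delete(sys.argv[1])

Usage would be something like "python slow_delete.py /path/to/old_file.couch"
(hypothetical path) during a low-volume window, possibly under ionice,
though as noted above ionice did not help with a plain rm.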