Subject: Re: Crash of CouchDB 1.2.x
From: Robert Newson
To: dev@couchdb.apache.org
Date: Mon, 12 Mar 2012 10:32:11 +0000

I can confirm that XFS is aggressive when deleting large files (other
i/o requests are slow or blocked while it does it). It has been
necessary to iteratively truncate a file instead of a simple 'rm' in
production to avoid that problem. Increasing the size of extent
preallocation ought to help considerably, but I've not yet deployed
that change. I *can* confirm that you can't 'ionice' the rm call,
though.

B.

On 12 March 2012 05:00, Randall Leeds wrote:
> On Mar 11, 2012 7:40 PM, "Jason Smith" wrote:
>>
>> On Mon, Mar 12, 2012 at 8:44 AM, Randall Leeds wrote:
>> > I'm not sure what else you could provide after the fact. If your couch
>> > came back online automatically, and did so quickly, I would expect to
>> > see very long response times while the disk was busy freeing the old,
>> > un-compacted file. We have had some fixes in the last couple releases
>> > to address similar issues, but maybe there's something lurking still.
>> > I've got no other ideas/leads at this time.
>>
>> Another long shot, but you could try a filesystem that doesn't
>> synchronously reclaim the space, like (IIRC) XFS, btrfs, or I think
>> ext2.
>
> I think you're referring to extents, which, IIRC, allow large, contiguous
> sections of a file to be allocated and freed with less bookkeeping and,
> therefore, fewer writes. This behavior is not any more or less synchronous.
>
> In my production experience, xfs does not show much benefit from this
> because any machine with more than one growing database still ends up
> with file fragmentation that limits the gains from extents.
>
> I suspect, but have not tried to verify, that very large RAID stripe
> sizes that force preallocation of larger blocks might deliver some gains.
>
> I have an open ticket for a manual delete option, which was designed to
> allow deletion of trashed files to occur during low-volume hours or using
> tools like ionice. Unfortunately, I never got a chance to experiment with
> that setup in production, though I have seen ionice help significantly to
> keep request latency down when doing large deletes (just not in this
> particular use case).
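
For illustration only, here is a minimal sketch of the iterative-truncate
idea mentioned above. It is not anything from this thread or from CouchDB
itself; the chunk size, pause, and fsync placement are assumptions you
would want to tune against your own disks before relying on it.

    #!/usr/bin/env python
    # Sketch: shrink a large file a chunk at a time before unlinking it,
    # so the filesystem frees extents gradually instead of all at once.
    # CHUNK and PAUSE are made-up knobs, not values anyone here deployed.
    import os
    import sys
    import time

    CHUNK = 256 * 1024 * 1024   # shrink 256 MB per step (assumption)
    PAUSE = 0.5                 # give other i/o a chance between steps

    def slow_delete(path):
        size = os.path.getsize(path)
        with open(path, 'r+b') as f:
            while size > 0:
                size = max(0, size - CHUNK)
                f.truncate(size)      # free a slice of extents
                f.flush()
                os.fsync(f.fileno())  # push the metadata update out now
                time.sleep(PAUSE)
        os.remove(path)               # file is empty, rm is now cheap

    if __name__ == '__main__':
        slow_delete(sys.argv[1])

Usage would be something like "python slow_delete.py /path/to/old_file.couch"
(hypothetical path) during a low-volume window, possibly under ionice,
though as noted above ionice did not help with a plain rm.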