couchdb-dev mailing list archives

From: Jason Smith <...@iriscouch.com>
Subject: Re: Crash of CouchDB 1.2.x
Date: Mon, 12 Mar 2012 12:29:20 GMT
I seem to remember that, say, ext2 had more or less constant-time unlinking.

On Mon, Mar 12, 2012 at 10:32 AM, Robert Newson <rnewson@apache.org> wrote:
> I can confirm that XFS is aggressive when deleting large files (other
> i/o requests are slow or blocked while it does it). It has been
> necessary to iteratively truncate a file instead of a simple 'rm' in
> production to avoid that problem. Increasing the size of extent
> preallocation ought to help considerably but I've not yet deployed
> that change. I *can* confirm that you can't 'ionice' the rm call,
> though.
>
> B.
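
Incidentally, here is roughly what I picture by "iteratively truncate":
shrink the file in steps before unlinking it, so the filesystem frees
extents in small batches rather than all at once. A minimal Python sketch;
the step size and sleep are guesses, not what Robert actually runs in
production:

    import os
    import time

    def slow_unlink(path, step=1 << 30, pause=0.1):
        # Shrink `path` by `step` bytes (1 GiB here) per truncate call, so
        # only that much space is reclaimed at a time, sleeping briefly
        # between calls to let other i/o through, then unlink the empty file.
        size = os.path.getsize(path)
        while size > 0:
            size = max(0, size - step)
            os.truncate(path, size)
            time.sleep(pause)
        os.unlink(path)
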
>
> On 12 March 2012 05:00, Randall Leeds <randall.leeds@gmail.com> wrote:
>> On Mar 11, 2012 7:40 PM, "Jason Smith" <jhs@iriscouch.com> wrote:
>>>
>>> On Mon, Mar 12, 2012 at 8:44 AM, Randall Leeds <randall.leeds@gmail.com>
>> wrote:
>>> > I'm not sure what else you could provide after the fact. If your couch
>>> > came back online automatically, and did so quickly, I would expect to
>>> > see very long response times while the disk was busy freeing the old,
>>> > un-compacted file. We have had some fixes in the last couple releases
>>> > to address similar issues, but maybe there's something lurking still.
>>> > I've got no other ideas/leads at this time.
>>>
>>> Another long shot, but you could try a filesystem that doesn't
>>> synchronously reclaim the space, like (IIRC) XFS, btrfs, or I think
>>> ext2.
>>
>> I think you're referring to extents, which, IIRC, allow large, contiguous
>> sections of a file to be allocated and freed with less bookkeeping and,
>> therefore, fewer writes. This behavior is not any more or less synchronous.
>>
>> In my production experience, xfs does not show much benefit from this,
>> because on any machine hosting more than one growing database the files
>> still become fragmented, which limits the gains from extents.
>>
>> I suspect, but have not tried to verify, that very large RAID stripe
>> sizes, which force preallocation of larger blocks, might deliver some
>> gains.
>>
>> I have an open ticket for a manual delete option, designed to allow
>> deletion of trashed files to be deferred to low-volume hours or handed
>> off to tools like ionice.  Unfortunately, I never got a chance to
>> experiment with that setup in production, though I have seen ionice help
>> significantly to keep request latency down when doing large deletes
>> (just not in this particular use case).
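
For the archives, running the delete under ionice's idle class would look
something like the sketch below (the path is made up, and this is only an
illustration of the idea, not the actual design in the ticket):

    import subprocess

    # Remove a file left over from compaction under the "idle" i/o
    # scheduling class, so foreground requests win the disk; meant to run
    # from cron during low-volume hours. The path is purely illustrative.
    subprocess.check_call(
        ["ionice", "-c", "3", "rm", "/srv/couchdb/dbname.couch.old"])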



-- 
Iris Couch
