couchdb-user mailing list archives

From Robert Samuel Newson <rnew...@apache.org>
Subject Re: bigcouch/couchdb file descriptor leak during compaction
Date Wed, 29 Jan 2014 00:22:28 GMT

Yes, anything moved to the .delete directory is fair game for deletion.

CouchDB and BigCouch move the file there and then delete it. This is so that, in the event
of a crash, only one directory needs to be cleaned up, rather than requiring a potentially
expensive recursive sweep.
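
As a rough illustration of that pattern, here is a Python sketch. It is not the actual Erlang code; the function names, the random staging name, and the directory layout are all assumptions made for the example.

```python
import os
import uuid


def delete_via_staging(path, staging_dir):
    """Move `path` into `staging_dir` under a random name, then unlink it.

    Hypothetical helper: if the process dies between the rename and the
    unlink, the leftover file is in `staging_dir` and nowhere else.
    """
    os.makedirs(staging_dir, exist_ok=True)
    staged = os.path.join(staging_dir, uuid.uuid4().hex)
    os.rename(path, staged)  # a single rename when both paths share a filesystem
    os.remove(staged)
    return staged


def sweep_staging(staging_dir):
    """Remove anything left in `staging_dir`, e.g. at startup after a crash.

    Returns the number of files removed. Only this one directory is
    scanned, so no recursive walk of the data directory is needed.
    """
    if not os.path.isdir(staging_dir):
        return 0
    removed = 0
    for name in os.listdir(staging_dir):
        os.remove(os.path.join(staging_dir, name))
        removed += 1
    return removed
```

Note that removing the name does not free the disk space or the descriptor while a process still has the file open, which is why lsof can still list such files as "(deleted)".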

As for why BigCouch fails to release your files, I don’t know. Is this happening for *all*
compactions or is it quite rare but has accumulated over a long period of time?
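
One way to tell those two cases apart is to sample the number of deleted-but-open files before and after a single compaction. A minimal sketch in Python, assuming it is fed the text output of `lsof -p <pid>`; `count_deleted` is a hypothetical helper name, and the `/.delete/` marker is taken from the paths in your listing:

```python
def count_deleted(lsof_output, marker="/.delete/"):
    """Count lsof lines naming an unlinked file under the staging directory.

    A line counts if it contains `marker` and ends with "(deleted)".
    This works whether the path is on the same line as the descriptor
    columns or wrapped onto a line of its own.
    """
    count = 0
    for line in lsof_output.splitlines():
        line = line.strip()
        if marker in line and line.endswith("(deleted)"):
            count += 1
    return count
```

If the count rises by roughly the same amount after every compaction, every compaction is leaking; if it only moves occasionally, it is a rarer event that has accumulated over time.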

The difference between 0.4.0 and 0.4.2 is small and there’s nothing that would induce this
issue afaik.

B.

On 28 Jan 2014, at 23:02, Vladimir Ralev <vladimir.ralev@gmail.com> wrote:

> Erlang R15B03 (erts-5.9.3.1) [source] [64-bit] [smp:8:8] [async-threads:0]
> [hipe] [kernel-poll:false]
> 
> Bigcouch is 0.4.2 latest.
> {"couchdb":"Welcome","version":"1.1.1","bigcouch":"0.4.2"}
> I haven't found anything unusual about the disk/OS/FS, all Debian defaults.
> But I will keep looking. I will try to look into the source code. Is it
> safe to assume all these deleted files are deleted by the compaction code
> and not some other part of the system?
> 
> 
> 
> 
> On Wed, Jan 29, 2014 at 12:23 AM, Robert Samuel Newson
> <rnewson@apache.org> wrote:
> 
>> 
>> 
>> What version of erlang is this? There are some to avoid, R14B02 being the
>> most notable.
>> 
>> Curious that you have so many of these. Is there anything odd about the
>> filesystem or disk?
>> 
>> All that said, a restart is your only method of freeing these if bigcouch
>> (0.4.0 I assume?) is not able to.
>> 
>> B
>> 
>> On 28 Jan 2014, at 21:49, Vladimir Ralev <vladimir.ralev@gmail.com> wrote:
>> 
>>> Hi guys,
>>> 
>>> I am monitoring a huge compaction right now and keeping an eye on the
>>> system so that it does not fail with an emfile error.
>>> 
>>> The number of file descriptors owned by couch is growing very fast. I
>>> can do:
>>> 
>>> lsof -p CouchPID
>>> 
>>> and get tons of these:
>>> 
>>> beam.smp 21853 root *152u   REG  254,2        8306 30670917
>>> /opt/bigcouch/var/lib/.delete/b4b3ab2330a9672d7138fb562ebf90dd (deleted)
>>> 
>>> beam.smp 21853 root *153u   REG  254,2        8282 30671071
>>> /opt/bigcouch/var/lib/.delete/a218b0088e72278990f848fd8b2de5d9 (deleted)
>>> 
>>> beam.smp 21853 root *154u   REG  254,2        8372 30670973
>>> /opt/bigcouch/var/lib/.delete/05a22639d021929b31c982954ef9e99b (deleted)
>>> 
>>> beam.smp 21853 root *155u   REG  254,2        8297 30671201
>>> /opt/bigcouch/var/lib/.delete/6669ce0a6c235a977ea46ded37928338 (deleted)
>>> 
>>> beam.smp 21853 root *156u   REG  254,2        8294 30670974
>>> /opt/bigcouch/var/lib/.delete/bd2a65a16205529faf9118f0bd6d26b1 (deleted)
>>> 
>>> beam.smp 21853 root *157u   REG  254,2        8294 30671159
>>> /opt/bigcouch/var/lib/.delete/6b87bba6fd0b87c1bcaf47a2ba22aee4 (deleted)
>>> 
>>> beam.smp 21853 root *158u   REG  254,2        8294 30670975
>>> /opt/bigcouch/var/lib/.delete/392e9bd3825ff953dc808f83f6cba97e (deleted)
>>> 
>>> 
>>> Currently there are 55000 such descriptors and they are never released.
>>> Note that these files are indeed deleted, but the system doesn't release
>>> the handles. I can't find a reference to similar problems. Is this a
>>> known issue I should watch out for?
>>> 
>>> 
>>> I suppose I can restart the system in between compactions to release the
>>> files, but if you have any other advice it's highly appreciated.
>> 
>> 

