couchdb-user mailing list archives

From Vladimir Ralev <vladimir.ra...@gmail.com>
Subject Re: bigcouch/couchdb file descriptor leak during compaction
Date Fri, 31 Jan 2014 10:13:01 GMT
To rule out a file system problem, I did the following:

chmod -R 777 /opt/bigcouch
chown -R root /opt/bigcouch

Then I ran BigCouch as root.

I still have a leak, but now it involves different files:

beam.smp  28679  root  *086u  REG  254,2  8282  32250112  /opt/bigcouch/var/lib/.shards/80000000-9fffffff/db1/1a/3a/3e9b00e2e5e72df737bf30cd24ad.1376585909_design/dfa1fd4be3aecb20848cad2feb20e00a.view

beam.smp  28679  root  *087u  REG  254,2  8285  32246907  /opt/bigcouch/var/lib/.shards/80000000-9fffffff/db1/07/d7/e18bfed619a6078c2b19fef66b2c.1371491971_design/dfa1fd4be3aecb20848cad2feb20e00a.view
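
To tell at a glance which kind of file is leaking, lsof output like the lines above can be grouped by category. This is only a rough sketch: it assumes the default nine-column lsof layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME), and the function names and the ".couch" category are illustrative choices, not anything taken from BigCouch.

```python
from collections import Counter


def parse_lsof_line(line):
    """Return (fd, name, deleted) for a regular-file lsof line, else None.

    Assumes the default lsof column layout; NAME is the ninth column and
    may carry a trailing "(deleted)" marker.
    """
    parts = line.split(None, 8)
    if len(parts) < 9 or parts[4] != "REG":
        return None  # header line, non-regular file, or short line
    name = parts[8].strip()
    deleted = name.endswith("(deleted)")
    if deleted:
        name = name[: -len("(deleted)")].rstrip()
    return parts[3], name, deleted


def classify(name, deleted):
    """Bucket a held-open file by the kind of leak it suggests."""
    if deleted:
        return "deleted"
    if name.endswith(".view"):
        return "view"
    if name.endswith(".couch"):  # assumed database file extension
        return "db"
    return "other"


def summarize(lsof_text):
    """Count regular-file descriptors per category in lsof output."""
    counts = Counter()
    for line in lsof_text.splitlines():
        parsed = parse_lsof_line(line)
        if parsed is not None:
            _, name, deleted = parsed
            counts[classify(name, deleted)] += 1
    return counts
```

The text passed to summarize() would be the captured output of "lsof -p CouchPID"; comparing the counters from before and after a compaction shows which category grows.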


The deleted files no longer leak, and there are no errors in the
BigCouch log. As far as I can tell, this happens on all compactions, even
if I pace them slowly. The machine eventually runs out of memory (because
the system limits are set really high).
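
To check whether the descriptor count really grows with every compaction, the process can be sampled over time. Below is a minimal sketch that assumes a Linux /proc filesystem and enough privileges to read the target process's fd directory. The function names, the default suffixes, and the sampling interval are illustrative assumptions.

```python
import os
import time


def open_fd_targets(pid):
    """Return the paths that the open descriptors of `pid` point to.

    Reads the symlinks under /proc/<pid>/fd (Linux only).
    """
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for entry in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, entry)))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
    return targets


def count_leaked(pid, suffixes=(".view", " (deleted)")):
    """Count descriptors whose target ends with one of `suffixes`.

    Linux appends " (deleted)" to the link target of an unlinked file.
    """
    return sum(1 for target in open_fd_targets(pid) if target.endswith(suffixes))


def watch(pid, samples=5, interval=60.0):
    """Sample the suspect descriptor count `samples` times, `interval` seconds apart."""
    counts = []
    for i in range(samples):
        counts.append(count_leaked(pid))
        if i < samples - 1:
            time.sleep(interval)
    return counts
```

Calling watch() with the beam.smp PID while compactions run gives a series of counts; a series that only ever rises is consistent with descriptors never being released.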

Can somebody point me to the source code of the CouchDB used in BigCouch
0.4.2? I will add some extra logging here and there to see if I can figure
it out.





On Wed, Jan 29, 2014 at 2:22 AM, Robert Samuel Newson <rnewson@apache.org> wrote:

>
> Yes, anything moved to the .delete directory is fair game for deletion.
>
> CouchDB and BigCouch move the file there and then delete it. This is so
> that, in the event of a crash, only one directory needs to be cleaned up
> rather than running a potentially expensive recursive sweep.
>
> As for why BigCouch fails to release your files, I don't know. Is this
> happening for *all* compactions or is it quite rare but has accumulated
> over a long period of time?
>
> The difference between 0.4.0 and 0.4.2 is small and there's nothing that
> would induce this issue afaik.
>
> B.
>
> On 28 Jan 2014, at 23:02, Vladimir Ralev <vladimir.ralev@gmail.com> wrote:
>
> > Erlang R15B03 (erts-5.9.3.1) [source] [64-bit] [smp:8:8] [async-threads:0]
> > [hipe] [kernel-poll:false]
> >
> > BigCouch is 0.4.2, the latest.
> > {"couchdb":"Welcome","version":"1.1.1","bigcouch":"0.4.2"}
> > I haven't found anything unusual about the disk/OS/FS; everything is
> > Debian defaults. But I will keep looking. I will also try to look into
> > the source code. Is it safe to assume all these deleted files are deleted
> > by the compaction code and not by some other part of the system?
> >
> >
> >
> >
> > On Wed, Jan 29, 2014 at 12:23 AM, Robert Samuel Newson <rnewson@apache.org> wrote:
> >
> >>
> >>
> >> What version of Erlang is this? There are some to avoid, R14B02 being
> >> the most notable.
> >>
> >> Curious that you have so many of these. Is there anything odd about the
> >> filesystem or disk?
> >>
> >> All that said, a restart is your only method of freeing these if
> >> BigCouch (0.4.0 I assume?) is not able to.
> >>
> >> B
> >>
> >> On 28 Jan 2014, at 21:49, Vladimir Ralev <vladimir.ralev@gmail.com> wrote:
> >>
> >>> Hi guys,
> >>>
> >>> I am monitoring a huge compaction right now and keeping an eye on the
> >>> system so that it does not fail with an emfile error.
> >>>
> >>> The number of file descriptors owned by couch is growing very fast. I
> >>> can do:
> >>>
> >>> lsof -p CouchPID
> >>>
> >>> and get tons of these:
> >>>
> >>> beam.smp 21853 root *152u   REG  254,2        8306 30670917 /opt/bigcouch/var/lib/.delete/b4b3ab2330a9672d7138fb562ebf90dd (deleted)
> >>>
> >>> beam.smp 21853 root *153u   REG  254,2        8282 30671071 /opt/bigcouch/var/lib/.delete/a218b0088e72278990f848fd8b2de5d9 (deleted)
> >>>
> >>> beam.smp 21853 root *154u   REG  254,2        8372 30670973 /opt/bigcouch/var/lib/.delete/05a22639d021929b31c982954ef9e99b (deleted)
> >>>
> >>> beam.smp 21853 root *155u   REG  254,2        8297 30671201 /opt/bigcouch/var/lib/.delete/6669ce0a6c235a977ea46ded37928338 (deleted)
> >>>
> >>> beam.smp 21853 root *156u   REG  254,2        8294 30670974 /opt/bigcouch/var/lib/.delete/bd2a65a16205529faf9118f0bd6d26b1 (deleted)
> >>>
> >>> beam.smp 21853 root *157u   REG  254,2        8294 30671159 /opt/bigcouch/var/lib/.delete/6b87bba6fd0b87c1bcaf47a2ba22aee4 (deleted)
> >>>
> >>> beam.smp 21853 root *158u   REG  254,2        8294 30670975 /opt/bigcouch/var/lib/.delete/392e9bd3825ff953dc808f83f6cba97e (deleted)
> >>>
> >>>
> >>> Currently there are 55000 such descriptors and they are never released.
> >>> Note that these files are indeed deleted, but the system doesn't
> >>> release the handles. I can't find a reference to similar problems. Is
> >>> this a known issue I should watch out for?
> >>>
> >>>
> >>> I suppose I can restart the system in between compactions to release
> >>> the files, but if you have any other advice it's highly appreciated.
> >>
> >>
>
>
