lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: Un-used index files are not getting released
Date Tue, 09 May 2017 08:56:31 GMT
Something looks slightly out of sync: _29ip.cfs is shown by lsof but not
by ls. That could just be a matter of timing, though, with Lucene doing
its stuff behind the scenes.
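
If you want to repeat that comparison without eyeballing two listings, a
rough sketch along these lines could diff them. The class and method names
(LsofDiff, fileNameOf, openButNotListed) are made up for illustration, and it
assumes default lsof output, where the path is the last column, optionally
followed by " (deleted)".

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class LsofDiff {
    /** Returns the file name at the end of one lsof output line, or null if the line has no path. */
    static String fileNameOf(String lsofLine) {
        String line = lsofLine.trim();
        String marker = " (deleted)";
        // lsof appends this marker to files that were unlinked but are still held open.
        if (line.endsWith(marker)) {
            line = line.substring(0, line.length() - marker.length());
        }
        int slash = line.lastIndexOf('/');
        if (slash < 0) {
            return null; // header line or a row without a path
        }
        String name = line.substring(slash + 1);
        return name.isEmpty() ? null : name;
    }

    /** Names that appear in the lsof lines but not in the directory listing. */
    static SortedSet<String> openButNotListed(List<String> lsofLines, Collection<String> listedNames) {
        SortedSet<String> open = new TreeSet<>();
        for (String line : lsofLines) {
            String name = fileNameOf(line);
            if (name != null) {
                open.add(name);
            }
        }
        open.removeAll(listedNames);
        return open;
    }

    public static void main(String[] args) {
        // Two rows from the lsof output and three names from the ls output quoted below.
        List<String> lsof = Arrays.asList(
            "java 32332 lucene 71r REG 8,1 606739 64749587 /lucene1/index/_29ip.cfs",
            "java 32332 lucene 73r REG 8,1 191494805022 64749583 /lucene1/index/_2988.cfs");
        List<String> ls = Arrays.asList("_2988.cfs", "_2988_8i.del", "segments_2615");
        System.out.println(openButNotListed(lsof, ls));
    }
}
```

With the sample rows above this prints [_29ip.cfs], the one file that lsof
reports and ls does not. Both listings need to be captured as close together
in time as possible for the result to mean anything.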

What does $ ls -al show?

What are the 1761 files returned by the java listFiles() call?  Are you
sure there isn't something using your index directory for some non-lucene
purpose?  That's usually best avoided.
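
To see what those entries actually are, a small throwaway class like the one
below could summarize the directory. It is only a sketch: IndexDirSummary and
its bucket labels are hypothetical, and the grouping (dot-files, names
starting with "segments", otherwise by extension) is just one guess at a
useful breakdown. Note that File.list(), like File.listFiles(), returns
dot-files that a plain ls -l hides, which is one possible reason for the
counts to differ.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

class IndexDirSummary {
    /** Buckets a file name: dot-files, "segments*" files, otherwise by extension. */
    static String bucketOf(String name) {
        if (name.startsWith(".")) {
            return "(hidden)";
        }
        if (name.startsWith("segments")) {
            return "segments";
        }
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "(no extension)" : name.substring(dot + 1);
    }

    /** Counts how many names fall into each bucket, sorted by bucket label. */
    static Map<String, Integer> summarize(String[] names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            counts.merge(bucketOf(name), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Pass the index directory as the first argument, e.g. /lucene1/index
        File dir = new File(args.length > 0 ? args[0] : ".");
        String[] names = dir.list(); // null if dir is not a readable directory
        if (names == null) {
            System.err.println("Not a readable directory: " + dir);
            return;
        }
        System.out.println(names.length + " entries in " + dir);
        summarize(names).forEach((bucket, count) -> System.out.println(bucket + ": " + count));
    }
}
```

Running it inside the Tomcat JVM (or at least as the same user) and comparing
the buckets with the ls -al output should show quickly whether the extra
entries are Lucene files at all.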


--
Ian.


On Mon, May 8, 2017 at 6:38 PM, Siraj Haider <siraj@jobdiva.com> wrote:

> Hi Ian,
> We do not open any IndexReader explicitly. We keep one instance of
> IndexWriter open (and never close it), and for searching we use
> SearcherManager. I checked lsof and did not find any files with deleted
> status.
>
> Following is the output of lsof for lucene1:
> [lucene@lidxnj39 ~]$ /usr/sbin/lsof | grep lucene1
> java  32332  lucene   71r  REG  8,1        606739  64749587  /lucene1/index/_29ip.cfs
> java  32332  lucene   73r  REG  8,1  191494805022  64749583  /lucene1/index/_2988.cfs
> java  32332  lucene   76r  REG  8,1       1423548  64749585  /lucene1/index/_29io.cfs
> java  32332  lucene   80r  REG  8,1       1204827  64749586  /lucene1/index/_29in.cfs
> java  32332  lucene   81r  REG  8,1       5453524  64749588  /lucene1/index/_29il.cfs
> java  32332  lucene   86r  REG  8,1       5453524  64749588  /lucene1/index/_29il.cfs
> java  32332  lucene   90r  REG  8,1      37530221  64749590  /lucene1/index/_29im.cfs
> java  32332  lucene   92r  REG  8,1       1204827  64749586  /lucene1/index/_29in.cfs
> java  32332  lucene   96r  REG  8,1      37530221  64749590  /lucene1/index/_29im.cfs
> java  32332  lucene  101r  REG  8,1       1423548  64749585  /lucene1/index/_29io.cfs
> java  32332  lucene  111r  REG  8,1      53364009  64749606  /lucene1/index/_29hj.cfs
> java  32332  lucene  114r  REG  8,1      53364009  64749606  /lucene1/index/_29hj.cfs
> java  32332  lucene  117r  REG  8,1  191494805022  64749583  /lucene1/index/_2988.cfs
> java  32332  lucene  119r  REG  8,1     195525434  64749601  /lucene1/index/_29fj.cfs
> java  32332  lucene  139r  REG  8,1     195525434  64749601  /lucene1/index/_29fj.cfs
>
> Following is the directory listing of the folder:
> [lucene@lidxnj39 ~]$ ls -l /lucene1/index/
> total 187294328
> -rw-r--r--. 1 lucene mis         1451 May  8 13:31 _2988_8i.del
> -rw-r--r--. 1 lucene mis 191494805022 May  8 02:10 _2988.cfs
> -rw-r--r--. 1 lucene mis           65 May  8 13:26 _29fj_8.del
> -rw-r--r--. 1 lucene mis    195525434 May  8 11:21 _29fj.cfs
> -rw-r--r--. 1 lucene mis           24 May  8 12:50 _29hj_2.del
> -rw-r--r--. 1 lucene mis     53364009 May  8 12:46 _29hj.cfs
> -rw-r--r--. 1 lucene mis      5453524 May  8 13:29 _29il.cfs
> -rw-r--r--. 1 lucene mis     37530221 May  8 13:29 _29im.cfs
> -rw-r--r--. 1 lucene mis      1204827 May  8 13:30 _29in.cfs
> -rw-r--r--. 1 lucene mis      1423548 May  8 13:31 _29io.cfs
> -rw-r--r--. 1 lucene mis         1714 May  8 13:31 segments_2615
> -rw-r--r--. 1 lucene mis           20 May  8 13:31 segments.gen
>
> But when I get the number of files in that index folder using Java
> (File.listFiles()), it lists 1761 files in that folder. This count goes
> down to a double-digit number when I restart Tomcat.
>
> Thanks for looking into it.
>
> ------
> Regards
> -Siraj Haider
> (212) 306-0154
>
> -----Original Message-----
> From: Ian Lea [mailto:ian.lea@gmail.com]
> Sent: Friday, May 05, 2017 9:33 AM
> To: java-user@lucene.apache.org
> Subject: Re: Un-used index files are not getting released
>
> The most common cause is unclosed index readers.  If you run lsof against
> the tomcat process id and see that some deleted files are still open,
> that's almost certainly the problem.  Then all you have to do is track it
> down in your code.
>
>
> --
> Ian.
>
>
> On Thu, May 4, 2017 at 10:09 PM, Siraj Haider <siraj@jobdiva.com> wrote:
>
> > Hi all,
> > We recently switched to Lucene 6.5 from 2.9 and we have an issue that
> > the files in the index directory are not getting released after the
> > IndexWriter finishes writing a batch of documents. We are using
> > IndexFolder.listFiles().length to check the number of files in the
> > index folder. We have even tried closing/reopening the
> > IndexWriter/SearcherManager/MMapDirectory after indexing each batch to
> > see if that would release the files, but it didn't. Only when we shut
> > down Tomcat and restart it do we see that number drop, which
> > proves that there were some deleted files still held by Lucene
> > somewhere. Can you please direct me on what needs to be checked?
> >
> > P.S. I apologize for the duplicate email; I didn't see yesterday's
> > email on the list.
> >
> > Regards
> > -Siraj
> >
> > ________________________________
> >
> > This electronic mail message and any attachments may contain
> > information which is privileged, sensitive and/or otherwise exempt
> > from disclosure under applicable law. The information is intended only
> > for the use of the individual or entity named as the addressee above.
> > If you are not the intended recipient, you are hereby notified that
> > any disclosure, copying, distribution (electronic or otherwise) or
> > forwarding of, or the taking of any action in reliance on, the
> > contents of this transmission is strictly prohibited. If you have
> > received this electronic transmission in error, please notify us by
> > telephone, facsimile, or e-mail as noted above to arrange for the
> > return of any electronic mail or attachments. Thank You.
> >
>
