lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: out-of-memory when searching, paging does not work.
Date Sun, 14 May 2006 20:59:19 GMT

Please review the advice in these archived messages; I think you'll find
it very applicable to your problem:

http://www.nabble.com/eliminating-scoring-for-the-sake-of-efficiency-t1603827.html#a4351614
http://www.nabble.com/Exact-date-search-doesn%27t-work-with-1.9.1--t1418643.html#a3833741
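
For readers who cannot reach the linked threads: one common way to walk
every match in Lucene without going through Hits is to pass a HitCollector
to Searcher.search(Query, HitCollector), which hands each matching document
id to a callback instead of building a ranked, cached result list. The
sketch below only illustrates that streaming pattern in plain Java, so it
runs without Lucene on the classpath. MatchCollector, search, filenameOf and
streamFilenames are hypothetical stand-ins, not Lucene classes, and treating
this as the approach the linked messages describe is an assumption, not a
quotation of them.

```java
public class StreamingHitsSketch {

    /** Hypothetical stand-in for a per-match callback; NOT a Lucene class. */
    interface MatchCollector {
        void collect(int docId, float score);
    }

    /** Hypothetical stand-in for a search that pushes every match to the collector. */
    static void search(int totalMatches, MatchCollector collector) {
        for (int docId = 0; docId < totalMatches; docId++) {
            collector.collect(docId, 1.0f); // the score is a placeholder value
        }
    }

    /** Hypothetical stand-in for loading the stored "filename" field of one document. */
    static String filenameOf(int docId) {
        return "file-" + docId + ".txt";
    }

    /** Visits every match once, keeping only a running count in memory. */
    static int streamFilenames(int totalMatches) {
        final int[] processed = {0};
        search(totalMatches, (docId, score) -> {
            String name = filenameOf(docId); // use the name here, then let it go
            if (!name.isEmpty()) {
                processed[0]++;
            }
        });
        return processed[0];
    }

    public static void main(String[] args) {
        // 899810 is the hit count reported in the original question.
        System.out.println("processed " + streamFilenames(899810));
    }
}
```

Because nothing is retained per match, memory use in this sketch stays flat
however many documents match. Against a real index, the callback would look
up the stored field by document id rather than call filenameOf.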



: Date: Sun, 14 May 2006 15:34:08 -0400
: From: Beady Geraghty <beadygeraghty@gmail.com>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Re: out-of-memory when searching, paging does not work.
:
: Here is the gist of the code:
:
:     Query query = new TermQuery( new Term("contents", q.toLowerCase()));
:
:
:     long start = new Date().getTime();
:     Hits hits = is.search(query);
:     long end = new Date().getTime();
:
:     System.err.println("Found " + hits.length() +
:       " document(s) (in " + (end - start) +
:       " milliseconds) that matched query '" +
:         q  + "'");
:
:
:     int ct = hits.length();
:     int ct2 = 400000;
:     int step = 10000;
:     int startct;
:     while (ct2 < ct) {
:       startct = ct2;
:       for (int i = startct; i < startct + step; i++) {
:         if (ct2 >= ct) {
:           break;
:         }
:         Document doc = hits.doc(ct2);
:         doc.get("filename");
:         ct2++;
:       }
:       System.out.println("ct2 is " + ct2);
:       ir.close();
:       is.close();
:       fsDir.close();
:       ir = null;
:       is = null;
:       fsDir = null;
:       fsDir = FSDirectory.getDirectory(indexDir, false);
:       ir = IndexReader.open(fsDir);
:       is = new IndexSearcher(ir);
:       hits = is.search(query);
:     }
:
: If ct2 is set to 40,000 as opposed to 400,000, I see some output before I
: get the out-of-memory error.  If not, I get the out-of-memory error almost
: instantly, without any output.
:
: Is there a method call to clear the cache?
:
: Thank you for your response.
:
:
: On 5/14/06, Erik Hatcher <erik@ehatchersolutions.com> wrote:
: >
: > Could you share at least some pseudo-code of what you're doing in the
: > loop of retrieving the "name" of each document?   Are you storing all
: > of those names as you iterate?
: >
: > Have you profiled your application to see exactly where the memory is
: > going?  It is surely being eaten by your own code and not Lucene.
: >
: >        Erik
: >
: >
: > On May 14, 2006, at 12:07 PM, Beady Geraghty wrote:
: >
: > > I get an out-of-memory error when returning many hits.
: > >
: > > I am still on Lucene 1.4.3
: > >
: > > I have a simple term query.  It returned 899810 documents.
: > > I try to retrieve the name of each document and nothing else
: > > and I ran out of memory.
: > >
: > > Instead of getting the names all at once, I tried to query again after
: > > every 10,000 documents.
: > > I close the index reader, index searcher, and the fsDir and re-query
: > > for every 10,000 documents.  This still doesn't work.
: > >
: > > From another entry in the forum, it appears that the information about
: > > the hits that I have skipped over is still kept even though I don't
: > > access them.  Am I understanding correctly that if I start accessing
: > > from the 400000th document onwards, some information about documents
: > > 0-399999 is still cached even though I have skipped over those?
: > > Is there a way to get the file name (and perhaps other information)
: > > of the remaining documents?
: > >
: > > (I tried a different term query that returned a hit size of 400000,
: > > and I was able to get the names of them all without re-querying.)
: > >
: > > I think I saw someone mention clearing the hit cache,
: > > though I don't know how this is done.
: > >
: > > Thank you in advance for any hints on dealing with this.
: >
: >
: > ---------------------------------------------------------------------
: > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
: > For additional commands, e-mail: java-user-help@lucene.apache.org
: >
: >
:



-Hoss



