lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Out of memory?
Date Wed, 12 Dec 2007 04:09:52 GMT
Bob,

Move the following line in your if block:

Sort sort = new Sort(sortColumn, desc);
That will fix your OOM problem.

Otis 

--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Bob Daha <neokalus@yahoo.com>
To: java-user@lucene.apache.org
Sent: Monday, December 10, 2007 5:42:27 PM
Subject: Re: Out of memory?

Interesting... I didn't explicitly turn that on... I'm creating a query
 using query parser, then executing a search using that query and a
 sort.  Would this algorithmically somehow invoke the FieldCacheImpl?  If
 so, anyway I can turn it off?  Basically the caching wouldn't be
 valuable anyway given the volatility of the index, and I'm not that concerned
 with speed.  (seeing most queries take well under 20ms)

Here's the searcher thread, if the code would help... note rd is my
 RAMDirectory object - it is never closed / reopened for the life of the
 application.

  public tps_search_results searchIndex(String query, String
 sortColumn, boolean desc) {
    ArrayList<Long> results = new ArrayList<Long>();

    try {

      if (!readerTPS.isCurrent()) {
        searcher.close();
        readerTPS.close();
        readerTPS = IndexReader.open(rd);
        searcher  = new IndexSearcher(readerTPS);
      }

      Query q = parserTPS.parse(query);

      Sort sort = new Sort(sortColumn, desc);

      Hits hits = searcher.search(q, sort);

      for (Iterator<Hit> it = hits.iterator(); it.hasNext(); ) {
        Hit hit = it.next();

        results.add(Long.parseLong(hit.get("ticket_id")));
      }

    } catch (Exception e) {
      rlog.exception(e);
    }

    rlog.queryComplete(": Results count: " + results.size());

    tps_search_results ret_results = new tps_search_results(results);

    return ret_results;
  }


----- Original Message ----
From: Chris Lu <chris.lu@gmail.com>
To: java-user@lucene.apache.org
Sent: Monday, December 10, 2007 2:17:26 PM
Subject: Re: Out of memory?


Looks like you are using FieldCacheImpl to count search results for
each category. ( or called facet search by a fancy name ).

Well, it's a cache and the terms are loaded in the memory.

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer Twenga got 2.6 Million Euro funding!
http://www.techcrunch.com/2007/12/07/twenga-gets-e26-million-for-product-search/

On Dec 10, 2007 11:56 AM, Bob Daha <neokalus@yahoo.com> wrote:
> Hello,
>
> I'm building a ticketing system for my company and am using Lucene
 for some of the more complicated queries.  I'd say my application
 differs
 from the typical lucene application in that my documents are
 (re)-indexed more frequently, the query load is actually relatively
 light, and
 most of the indexed document data is numerical.  (ticket id, owner id,
 date ranges, etc are numbers and there's just one giant "text"
 searchable field.)
>
> I just got an out of memory exception on my app.  When I fired up
 jhat, I was very surprised at the object allocation:
>
> 7862 instances of class org.apache.lucene.index.Term
> 7787 instances of class org.apache.lucene.index.TermInfo
> 504 instances of class org.apache.lucene.index.FieldInfo
> 224 instances of class org.apache.lucene.index.TermBuffer
> 79 instances of class
 org.apache.lucene.index.CompoundFileReader$CSIndexInput
> 75 instances of class org.apache.lucene.index.SegmentTermEnum
> 70 instances of class org.apache.lucene.store.RAMFile
> 64 instances of class org.apache.lucene.store.RAMInputStream
> 63 instances of class org.apache.lucene.index.FieldInfos
> 20 instances of class
 com.facebook.tps_search_server.tps_search_ticket
> 20 instances of class
 com.facebook.tps_search_server.tps_search_ticket$Isset
> 10 instances of class org.apache.lucene.index.IndexReader$FieldOption
> 9 instances of class
 org.apache.lucene.index.IndexFileDeleter$RefCount
> 8 instances of class
 org.apache.lucene.index.CompoundFileReader$FileEntry
> 8 instances of class
 org.apache.lucene.index.CompoundFileWriter$FileEntry
> 8 instances of class org.apache.lucene.index.SegmentReader$Norm
> 7 instances of class org.apache.lucene.document.FieldSelectorResult
> 5 instances of class org.apache.lucene.document.Field$TermVector
> 5 instances of class org.apache.lucene.queryParser.Token
> 4 instances of class org.apache.lucene.document.Field$Index
> 4 instances of class org.apache.lucene.search.FieldCacheImpl$Entry
> 3 instances of class com.facebook.thrift.protocol.TBinaryProtocol
> 3 instances of class com.facebook.thrift.transport.TSocket
> 3 instances of class org.apache.lucene.document.Field$Store
> 3 instances of class org.apache.lucene.index.SegmentInfos
> 3 instances of class org.apache.lucene.search.BooleanClause$Occur
> 2 instances of class
 com.facebook.thrift.protocol.TBinaryProtocol$Factory
> 2 instances of class
 com.facebook.thrift.server.TThreadPoolServer$WorkerProcess
> 2 instances of class com.facebook.thrift.transport.TTransportFactory
> 2 instances of class
 org.apache.lucene.analysis.standard.StandardAnalyzer
> 2 instances of class org.apache.lucene.index.SegmentInfo
> 2 instances of class
 org.apache.lucene.queryParser.QueryParser$Operator
> 2 instances of class org.apache.lucene.search.DefaultSimilarity
> 2 instances of class org.apache.lucene.search.FieldSortedHitQueue$2
> 2 instances of class org.apache.lucene.search.Sort
> 2 instances of class org.apache.lucene.search.SortField
> 2 instances of class org.apache.lucene.store.RAMDirectory
> 2 instances of class
 org.apache.lucene.store.SingleInstanceLockFactory
> 2 instances of class [Lorg.apache.lucene.search.SortField;
> ...
>
> Any ideas on why I'd have so many Term and TermInfo objects?  To give
 you a little more insight into my application, worker threads can be
 spawned by client requests that do one of 4 things - search, index
 tickets, deindex tickets, or reindex tickets.  (reindex a ticket is
 basically deindex then index ticket)  I'm using a RAMDirectory for all
 my
 searches and updates although I have another thread which backs this
 up to
 disc periodically.
>
> Thanks,
> B
>
>
>
>
>      

 ____________________________________________________________________________________
> Looking for last minute shopping deals?
> Find them fast with Yahoo! Search.

  http://tools.search.yahoo.com/newsearch/category.php?category=shopping

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org







    
  ____________________________________________________________________________________
Never miss a thing.  Make Yahoo your home page. 
http://www.yahoo.com/r/hs



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message