lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bob Daha <neoka...@yahoo.com>
Subject Out of memory?
Date Mon, 10 Dec 2007 19:56:46 GMT
Hello,

I'm building a ticketing system for my company and am using Lucene for some of the more complicated
queries.  I'd say my application differs from the typical lucene application in that my documents
are (re)-indexed more frequently, the query load is actually relatively light, and most of
the indexed document data is numerical.  (ticket id, owner id, date ranges, etc are numbers
and there's just one giant "text" searchable field.)

I just got an out of memory exception on my app.  When I fired up jhat, I was very surprised
at the object allocation:

7862 instances of class org.apache.lucene.index.Term 
7787 instances of class org.apache.lucene.index.TermInfo 
504 instances of class org.apache.lucene.index.FieldInfo 
224 instances of class org.apache.lucene.index.TermBuffer 
79 instances of class org.apache.lucene.index.CompoundFileReader$CSIndexInput 
75 instances of class org.apache.lucene.index.SegmentTermEnum 
70 instances of class org.apache.lucene.store.RAMFile 
64 instances of class org.apache.lucene.store.RAMInputStream 
63 instances of class org.apache.lucene.index.FieldInfos 
20 instances of class com.facebook.tps_search_server.tps_search_ticket 
20 instances of class com.facebook.tps_search_server.tps_search_ticket$Isset 
10 instances of class org.apache.lucene.index.IndexReader$FieldOption 
9 instances of class org.apache.lucene.index.IndexFileDeleter$RefCount 
8 instances of class org.apache.lucene.index.CompoundFileReader$FileEntry 
8 instances of class org.apache.lucene.index.CompoundFileWriter$FileEntry 
8 instances of class org.apache.lucene.index.SegmentReader$Norm 
7 instances of class org.apache.lucene.document.FieldSelectorResult 
5 instances of class org.apache.lucene.document.Field$TermVector 
5 instances of class org.apache.lucene.queryParser.Token 
4 instances of class org.apache.lucene.document.Field$Index 
4 instances of class org.apache.lucene.search.FieldCacheImpl$Entry 
3 instances of class com.facebook.thrift.protocol.TBinaryProtocol 
3 instances of class com.facebook.thrift.transport.TSocket 
3 instances of class org.apache.lucene.document.Field$Store 
3 instances of class org.apache.lucene.index.SegmentInfos 
3 instances of class org.apache.lucene.search.BooleanClause$Occur 
2 instances of class com.facebook.thrift.protocol.TBinaryProtocol$Factory 
2 instances of class com.facebook.thrift.server.TThreadPoolServer$WorkerProcess 
2 instances of class com.facebook.thrift.transport.TTransportFactory 
2 instances of class org.apache.lucene.analysis.standard.StandardAnalyzer 
2 instances of class org.apache.lucene.index.SegmentInfo 
2 instances of class org.apache.lucene.queryParser.QueryParser$Operator 
2 instances of class org.apache.lucene.search.DefaultSimilarity 
2 instances of class org.apache.lucene.search.FieldSortedHitQueue$2 
2 instances of class org.apache.lucene.search.Sort 
2 instances of class org.apache.lucene.search.SortField 
2 instances of class org.apache.lucene.store.RAMDirectory 
2 instances of class org.apache.lucene.store.SingleInstanceLockFactory 
2 instances of class [Lorg.apache.lucene.search.SortField;
...

Any ideas on why I'd have so many Term and TermInfo objects?  To give you a little more insight
into my application, worker threads can be spawned by client requests that do one of 4 things
- search, index tickets, deindex tickets, or reindex tickets.  (reindex a ticket is basically
deindex then index ticket)  I'm using a RAMDirectory for all my searches and updates although
I have another thread which backs this up to disc periodically.

Thanks,
B




      ____________________________________________________________________________________
Looking for last minute shopping deals?  
Find them fast with Yahoo! Search.  http://tools.search.yahoo.com/newsearch/category.php?category=shopping
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message