lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Term Frequency vector consumes memory
Date Tue, 30 Jun 2009 16:18:18 GMT
In Lucene, a Term Vector is a specific thing that is stored on disk
when creating a Document and Field.  It is optional and off by
default.  It is separate from the ability to get the term frequencies
for all the docs in a specific field.  Term Vectors are decided at
indexing time, and there is no way to remove them without reindexing.
Furthermore, they are not loaded into memory by the IndexReader.  Term
Frequencies are accessed via TermDocs.
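
To make the distinction concrete, here is a minimal sketch in plain Java.
It is a toy model, not Lucene code: the class TermStatsSketch and all of
its methods are hypothetical names made up for illustration. It builds a
small postings structure (term -> doc -> frequency) and derives "top
terms" statistics from it alone, with no per-document term vector
involved. In the Lucene 2.x API the analogous information comes from
IndexReader.terms() with TermEnum.docFreq(), and from
IndexReader.termDocs(Term) with TermDocs.freq().

```java
import java.util.*;

public class TermStatsSketch {

    // Toy postings: term -> (docId -> number of times the term occurs in that doc).
    // This stands in for what an inverted index already holds for an indexed field.
    static Map<String, Map<Integer, Integer>> buildPostings(List<String> docs) {
        Map<String, Map<Integer, Integer>> postings = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String token : docs.get(docId).toLowerCase().split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // skip the empty token produced by leading whitespace
                }
                postings.computeIfAbsent(token, t -> new TreeMap<>())
                        .merge(docId, 1, Integer::sum);
            }
        }
        return postings;
    }

    // Total occurrences of a term across all docs (sum of per-doc frequencies).
    static int totalFreq(Map<String, Map<Integer, Integer>> postings, String term) {
        int sum = 0;
        for (int freq : postings.getOrDefault(term, Collections.emptyMap()).values()) {
            sum += freq;
        }
        return sum;
    }

    // Top n terms by document frequency; ties are broken alphabetically.
    static List<String> topTermsByDocFreq(Map<String, Map<Integer, Integer>> postings, int n) {
        List<String> terms = new ArrayList<>(postings.keySet());
        terms.sort((a, b) -> {
            int byDocFreq = Integer.compare(postings.get(b).size(), postings.get(a).size());
            return byDocFreq != 0 ? byDocFreq : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(n, terms.size())));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "lucene index term",
                "term vector term",
                "index reader");
        Map<String, Map<Integer, Integer>> postings = buildPostings(docs);
        System.out.println(topTermsByDocFreq(postings, 2));
        System.out.println(totalFreq(postings, "term"));
    }
}
```

Sorting every term is fine for a toy; on an index with millions of
distinct terms, a bounded priority queue that keeps only the current top
n would likely be the better choice.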

Can you clarify a bit more what you are looking to do?  Perhaps some  
sample code will help demonstrate what you'd like to turn off, as I am  
not clear on your question.

Cheers,
Grant

On Jun 30, 2009, at 3:37 AM, Ganesh wrote:

> At the end of each day, I build stats of the top indexed
> terms. I enabled term frequency for a single field. It is working
> fine. I am able to get the top terms and their frequencies, but it
> consumes a huge amount of RAM. My index size is 5 GB and has 8 million
> records. If I don't enable term vectors, I can index up to
> 17 GB with 40 million records.
>
> When an IndexReader/Searcher is opened, does it load all term
> vector frequencies?
>
> Suppose I have enabled this option and indexed, say, 5 GB. Now I don't
> want the Reader/Searcher to load term vectors. I want to switch off
> this feature. Is that possible without re-indexing?
>
> Regards
> Ganesh
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search

