lucene-java-user mailing list archives

From Marvin Humphrey
Subject Re: Memory Usage
Date Mon, 14 Nov 2005 05:19:56 GMT

On Nov 13, 2005, at 6:27 PM, Chris Hostetter wrote:

> I believe if you really want to determine settings like this after
> building the index, you'll need to do an initial build of the index
> using best guess values -- then if the calculations you do once the
> index is built aren't close enough to your guesses to satisfy you,
> change the value and optimize.

Good tip.  :)

> From what i remember about how optimize works, it creates all new
> segments regardless of the previous state of the index -- and those
> new segments should use the newly set values.

Yes, it will use the new values.  The one caveat is that if the index  
is already optimized, calling optimize() won't do anything.  Adding  
or deleting a single document is enough to trigger a rewrite, though.

Daniel, under the hood, there are two term dictionary files, with  
nearly identical structures: the main .tis file, and the index .tii  
file. (Mnemonic: .tis is TermInfoS, and .tii is TermInfosIndex.) If  
indexInterval is set to the default of 128, then the .tii file  
contains every 128th entry from the main file, plus a pointer to  
where that entry is located in the main file.

When you load up an IndexReader, the entire .tii file gets  
decompressed and loaded into RAM.  The number of entries in the .tii  
file corresponds directly to the RAM footprint.  If you want less RAM  
usage, that file has to get smaller.
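
To put rough numbers on that, here is a back-of-the-envelope sketch in Java. It is not Lucene code: the class and method names are made up for illustration, the term count is a hypothetical figure, and it assumes only what is described above, namely that the .tii file holds about one entry per indexInterval terms and that RAM usage scales with the number of entries.

```java
// Rough estimate only. Assumes the .tii file holds about one entry per
// indexInterval terms; names here are illustrative, not Lucene API.
final class TermIndexEstimate {

    // Approximate number of .tii entries for a given number of terms.
    static long indexEntries(long termCount, int indexInterval) {
        if (termCount < 0 || indexInterval < 1) {
            throw new IllegalArgumentException(
                "need termCount >= 0 and indexInterval >= 1");
        }
        // Ceiling division: one entry for each full or partial block of terms.
        return (termCount + indexInterval - 1) / indexInterval;
    }

    // Entry count with newInterval relative to oldInterval, as a proxy
    // for the relative RAM footprint of the loaded term index.
    static double relativeFootprint(long termCount, int oldInterval, int newInterval) {
        long before = indexEntries(termCount, oldInterval);
        if (before == 0) {
            return 1.0; // empty index: nothing to shrink
        }
        return (double) indexEntries(termCount, newInterval) / before;
    }

    public static void main(String[] args) {
        long terms = 10_000_000L; // hypothetical term count
        for (int interval : new int[] {128, 256, 512, 1024}) {
            System.out.printf("interval=%d entries=%d relative=%.3f%n",
                interval,
                indexEntries(terms, interval),
                relativeFootprint(terms, 128, interval));
        }
    }
}
```

Doubling the interval roughly halves the number of entries held in RAM. The price is that a lookup may have to scan further through the main .tis file after locating the nearest index entry.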

Hoss's solution is the fastest way to find the best values for  
indexInterval -- you're rewriting the entire index, but it's faster  
than reindexing from scratch because you don't need to redo the IO or  
the analysis.  Few people will find it useful to tinker with this,  
but you're the exception, and I'll be interested to hear about your  


Marvin Humphrey
Rectangular Research
