lucene-java-user mailing list archives

From Michael McCandless <>
Subject Re: setTermInfosIndexDivisor
Date Thu, 25 Jun 2009 09:54:59 GMT
On Thu, Jun 25, 2009 at 5:40 AM, Ganesh<> wrote:
> I am updating the status of documents frequently, so there will be a huge number of deletes.
> I optimize the index once a day.


> I want to know the usage for setTermInfosIndexDivisor.
> Directory dir = FSDirectory.getDirectory(indexPath);
> IndexReader reader =, true);
> reader.setTermInfosIndexDivisor(5);
> I reopen the IndexReader whenever a document is added to the index. Do I need to
> call setTermInfosIndexDivisor(5) when reopening the index as well? I tried this: the first
> time it was accepted, but from the second time onwards it throws a "terms already loaded" exception.

In fact Lucene has a bug here: on reopen, your index divisor is not
properly carried over to the newly opened segments.  Worse, if you
attempt to call setTermInfosIndexDivisor, it'll throw an exception
because the already-opened readers have already loaded their terms
index.  I think the only workaround is to not use reopen.
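
As a rough sketch of that workaround (not Lucene code, and not a fix for the bug itself): keep the reader behind a small holder that always opens a brand-new reader, configures it before first use, and closes the old one, instead of calling reopen. The names FreshReaderHolder, Opener and Closer are hypothetical and exist only for this illustration; with Lucene, the Opener would open the reader as in your snippet and then call setTermInfosIndexDivisor(5) before any term lookup.

```java
import java.io.IOException;

/** Hypothetical helper: swaps in a freshly opened reader instead of calling reopen(). */
final class FreshReaderHolder<R> {

    /** Opens AND configures a reader (for example, sets the index divisor) before it is used. */
    interface Opener<R> {
        R open() throws IOException;
    }

    /** Releases a reader that is no longer needed. */
    interface Closer<R> {
        void close(R reader) throws IOException;
    }

    private final Opener<R> opener;
    private final Closer<R> closer;
    private R current;

    FreshReaderHolder(Opener<R> opener, Closer<R> closer) throws IOException {
        this.opener = opener;
        this.closer = closer;
        this.current = opener.open();
    }

    synchronized R get() {
        return current;
    }

    /** Opens a new reader, swaps it in, then closes the previous one. */
    synchronized void refresh() throws IOException {
        R fresh = opener.open(); // if this throws, the old reader stays in place
        R old = current;
        current = fresh;
        closer.close(old);
    }

    public static void main(String[] args) throws IOException {
        // Strings stand in for real readers so the sketch runs without Lucene.
        final int[] opened = {0};
        FreshReaderHolder<String> holder = new FreshReaderHolder<String>(
                () -> "reader-" + (++opened[0]),
                r -> System.out.println("closing " + r));
        holder.refresh();
        System.out.println("current = " + holder.get());
    }
}
```

The sketch assumes no search is still running on the old reader when refresh() closes it; a real application would have to guarantee that separately. It also gives up the main benefit of reopen, since every refresh loads all segments again.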

>>Loaded terms might not dominate your memory consumption inside
>>Lucene. Again, you should provide more information about your indexing, the
>>environment and the situation where the error occurs.
> I index with no norms and with all default values.

OK.  As of 2.9 (not yet released), you should also call
IndexReader.setDisableFakeNorms(true), to prevent the creation of a
full array of "fake" norms.

> As per the documentation, it should subsample the terms loaded into memory.

That's what termInfosIndexDivisor does, but if the memory used by your
actual index's terms index is smallish (run CheckIndex to see), this
setting won't help much anyway.
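
To get a feel for the numbers, here is a back-of-the-envelope sketch of the subsampling. It assumes the index was written with the default term index interval of 128, so that with a divisor of N roughly every (128 * N)-th term is held in RAM. The class name, method name and the term count of 10 million are made up for illustration; CheckIndex will tell you the real term counts for your index.

```java
/** Rough estimate of how many terms-index entries a reader keeps in memory. */
final class TermsIndexEstimate {

    /**
     * Approximate number of indexed terms held in RAM when every
     * (indexInterval * divisor)-th term is loaded.
     */
    static long loadedEntries(long totalTerms, int indexInterval, int divisor) {
        if (totalTerms < 0 || indexInterval < 1 || divisor < 1) {
            throw new IllegalArgumentException(
                    "totalTerms must be >= 0; indexInterval and divisor must be >= 1");
        }
        long effectiveInterval = (long) indexInterval * divisor;
        // Ceiling division, because the first term is counted as an index entry.
        return (totalTerms + effectiveInterval - 1) / effectiveInterval;
    }

    public static void main(String[] args) {
        long totalTerms = 10000000L; // hypothetical number of unique terms
        int indexInterval = 128;     // assumed default term index interval
        for (int divisor : new int[] {1, 2, 5}) {
            System.out.println("divisor=" + divisor + " -> "
                    + loadedEntries(totalTerms, indexInterval, divisor)
                    + " terms-index entries in RAM");
        }
    }
}
```

If the entry count at divisor 1 is already small compared to your heap, a larger divisor buys little memory and only makes term lookups slower.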

Are you sorting by field for any of your queries?

