lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler" <...@thetaphi.de>
Subject RE: setTermInfosIndexDivisor
Date Thu, 25 Jun 2009 12:58:09 GMT
> The culprit here is sorting. I stopped sorting then memory consumption is
> reduced nearly 50%. Further for Testing purpose i set
> setTermInfosIndexDivisor to 50 then memory consumption is further reduced.
> 
> Currently i am sorting DateTime with minute resolution as single string.
> If i split it as Date, Hour and Minute then memory consumption in sorting
> will get reduced.

Use NumericRangeQuery from trunk lucene 2.9 by indexing longs from
Date.getTime(). This will help and offers you real numeric ranges very fast
and also the FieldCache only takes 8 bytes per doc (size of long).

Hopefully Lucene 2.9 will be released soon, but you can start to test it.

If you want to do sorting only (with Lucene 2.4), it is also good to simply
index the date as a plain text long number (using
Long.toString(Date.getTime())) and use SortField.LONG to sort (which does
the same). Only RangeQueries are not possible that way.

> ----- Original Message -----
> From: "Michael McCandless" <lucene@mikemccandless.com>
> To: <java-user@lucene.apache.org>
> Sent: Thursday, June 25, 2009 3:44 PM
> Subject: Re: setTermInfosIndexDivisor
> 
> 
> On Thu, Jun 25, 2009 at 6:09 AM, Ganesh<emailgane@yahoo.co.in> wrote:
> >
> > What about setTermInfosIndexDivisor??
> >
> > Directory dir = FSDirectory.getDirectory(indexPath);
> > IndexReader reader = IndexReader.open(dir, true);
> > reader.setTermInfosIndexDivisor(5);
> >
> > It supposed to load only one fifth of the terms available?? But there is
> no difference in memory consumption with / without settings this
> parameter.
> 
> No, it's loads 1/5th of the 1/128th of all the terms available.  Ie,
> by default Lucene loads every 128th term into RAM for searching.  If
> you call setTermInfosIndexDivisor(5), then it loads every 128*5 =
> every 640 terms.
> 
> > I reopen the IndexReader whenever there is any document added to Index.
> Do i need to set setTermInfosIndexDivisor(5); during re-opening of the
> index also. I tried this, first time it accepted and second time onwards
> it throws "terms already loaded" expection
> 
> Calling setTermInfosIndexDivisor on a reopened reader won't work,
> because of a [just now discovered] bug in Lucene.  The workaround (if
> you really must use setTermInfosIndexDivisor) is to not use reopen.
> 
> Mike
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> Send instant messages to your online friends http://in.messenger.yahoo.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message