lucene-java-user mailing list archives

From "Karsten F." <>
Subject RE: Faceting, Sort and DocIDSet
Date Mon, 20 Apr 2009 19:59:59 GMT

Hi David,

correct: you should avoid reading the content of a document inside a
Normally that means caching everything you need in main memory. A very simple
and fast case is a facet with only 255 possible values and exactly one value
per document. In that case you only need a byte[IndexReader.maxDoc()] array in
the cache and an int[256] array for collecting the results
(we have 5 GByte to run Lucene with a couple of facets).
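
As a rough illustration of the byte-array idea, here is a minimal sketch in plain Java with no Lucene dependency. The class name ByteFacetCounter and its methods are made up for this example; in a real application the array would have length IndexReader.maxDoc(), would be filled once when the index is opened, and collect() would be called for each hit from whatever hit-collecting callback you use.

```java
// Sketch only: plain Java, hypothetical names, no Lucene classes involved.
final class ByteFacetCounter {
    // One facet value ID per document, indexed by document number.
    // Assumed to be filled once up front (length would be IndexReader.maxDoc()).
    private final byte[] valueOfDoc;

    // One counter per possible byte value.
    private final int[] counts = new int[256];

    ByteFacetCounter(byte[] valueOfDoc) {
        this.valueOfDoc = valueOfDoc;
    }

    // Call once per hit; only touches the in-memory arrays,
    // never the stored content of the document.
    void collect(int doc) {
        // Mask to 0..255 because Java bytes are signed.
        counts[valueOfDoc[doc] & 0xFF]++;
    }

    // Number of collected hits that carry the given facet value ID.
    int count(int valueId) {
        return counts[valueId];
    }

    public static void main(String[] args) {
        byte[] cache = {1, 2, 1, (byte) 200, 1};    // facet value IDs of 5 documents
        ByteFacetCounter counter = new ByteFacetCounter(cache);
        for (int doc : new int[] {0, 2, 3}) {       // document numbers of the hits
            counter.collect(doc);
        }
        System.out.println(counter.count(1));       // prints 2
        System.out.println(counter.count(200));     // prints 1
    }
}
```

The cost per hit is one array read and one increment, and the memory cost is one byte per document, so roughly 100 MByte per facet for 100M documents.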

About "facet": for me, a facet corresponds to a field in Lucene, so 300
facets are quite a lot.
Or did you mean two facets with 150 values each?

To find a good solution for your 100M documents, I have three questions:
 - How many hits per search?
 - More than one facet value per document? If so, how many on average?

INDEXORDER means document number.
MultiSearcher also works fine:
If you have one index for each year, and within each of these indices the
index order follows the date, then the MultiSearcher will also have correct
Take a look at the variable "int[] starts" in MultiSearcher.
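
To make the role of "starts" a bit more concrete, here is a small sketch of the offset arithmetic as I understand it: each sub-index gets a starting offset equal to the sum of the maxDoc() values of the sub-indices before it, and the global document number is that offset plus the local document number. This is plain Java with hypothetical names (DocIdOffsets, computeStarts, toGlobal, subIndexOf), not the actual MultiSearcher code.

```java
// Sketch only: illustrates offset-based document numbering with hypothetical names.
final class DocIdOffsets {
    // starts[i] is the first global document number of sub-index i;
    // the last entry is the total number of documents.
    static int[] computeStarts(int[] maxDocs) {
        int[] starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
        return starts;
    }

    // Global document number of a document inside one sub-index.
    static int toGlobal(int[] starts, int subIndex, int localDoc) {
        return starts[subIndex] + localDoc;
    }

    // Which sub-index a global document number belongs to (simple linear scan).
    static int subIndexOf(int[] starts, int globalDoc) {
        if (globalDoc < 0) {
            throw new IllegalArgumentException("negative document number: " + globalDoc);
        }
        for (int i = 0; i < starts.length - 1; i++) {
            if (globalDoc < starts[i + 1]) {
                return i;
            }
        }
        throw new IllegalArgumentException("document number out of range: " + globalDoc);
    }

    public static void main(String[] args) {
        // Three yearly indices, listed oldest first, with 3, 2 and 4 documents.
        int[] starts = computeStarts(new int[] {3, 2, 4});   // {0, 3, 5, 9}
        System.out.println(toGlobal(starts, 1, 1));          // prints 4
        System.out.println(subIndexOf(starts, 4));           // prints 1
    }
}
```

Because every document of an earlier sub-index gets a smaller global number than every document of a later one, the numbers of different sub-indices cannot interleave, provided the sub-indices are passed in year order.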

David Seltzer wrote:
> Is INDEXORDER based on the DocumentID within each individual index? If so
> then the results could be interleaved. Anyone know how this behaves?
