lucene-java-user mailing list archives

From Karl Wettin
Subject Re: about contrib instantiated
Date Sat, 03 Jul 2010 14:48:45 GMT

On 2 Jul 2010, at 08:32, Li Li wrote:

> I have an index of
> about 8,000,000 documents and the current index size is about 30GB. Is
> it possible to use this contrib to speed up my search? I have enough
> memory for it.

In order to answer your question you'll need to benchmark using a lot  
of typical queries. My guess is that it will probably be about as fast  
as a RAMDirectory while consuming a lot more memory. It's hard to say  
for sure though.

InstantiatedIndex (II) is faster than RAMDirectory (RD) mainly because
RD has to unmarshal information from a byte stream into Java instances,
while II already holds those instances in memory, hence the name. As
the index grows, the time RD spends unmarshalling shrinks relative to
the time spent seeking (mainly in DocsEnum/DocsAndPositionsEnum) and
scoring documents. Thus queries on a large index using terms that occur
in only a small portion of the documents should be faster on II than on
RD, while queries using frequently occurring terms will take about the
same time on both.

(Perhaps the documentation should explain it this way rather than just  
state "Mileage may vary depending on term saturation".)

While benchmarking, remember that RD might require a warm-up period
while II does not.
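
A benchmark along these lines could be sketched as follows. This is only
a minimal illustration in plain Java, not code from the contrib: the
class name QueryBenchmark, the method medianNanosPerQuery and the
`search` function are hypothetical stand-ins, and `search` would have to
be replaced by a call into a real searcher over each index type. It runs
untimed warm-up rounds first, then reports a median time per query so
that rare-term and frequent-term queries can be compared separately.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper, not part of Lucene: times a set of queries against
// any search function after an untimed warm-up phase.
class QueryBenchmark {

    /**
     * Runs every query warmupRounds times without timing, then
     * measuredRounds times with timing, and returns the median elapsed
     * nanoseconds for each query (upper median for even round counts).
     * The search function is assumed to return a hit count.
     */
    static long[] medianNanosPerQuery(List<String> queries,
                                      ToIntFunction<String> search,
                                      int warmupRounds,
                                      int measuredRounds) {
        if (measuredRounds < 1) {
            throw new IllegalArgumentException("measuredRounds must be >= 1");
        }
        long sink = 0; // accumulates results so the calls are not optimized away

        // Warm-up phase: results are discarded, nothing is timed.
        for (int r = 0; r < warmupRounds; r++) {
            for (String q : queries) {
                sink += search.applyAsInt(q);
            }
        }

        // Measured phase: collect one sample per round for each query.
        long[] medians = new long[queries.size()];
        for (int i = 0; i < queries.size(); i++) {
            long[] samples = new long[measuredRounds];
            for (int r = 0; r < measuredRounds; r++) {
                long start = System.nanoTime();
                sink += search.applyAsInt(queries.get(i));
                samples[r] = System.nanoTime() - start;
            }
            Arrays.sort(samples);
            medians[i] = samples[measuredRounds / 2];
        }

        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically unreachable; keeps sink live
        }
        return medians;
    }

    public static void main(String[] args) {
        List<String> queries = Arrays.asList("rare_term", "common_term");
        // Placeholder search function; swap in a real index lookup here.
        ToIntFunction<String> fakeSearch = q -> q.length();
        long[] medians = medianNanosPerQuery(queries, fakeSearch, 5, 21);
        for (int i = 0; i < queries.size(); i++) {
            System.out.println(queries.get(i) + ": " + medians[i] + " ns");
        }
    }
}
```

Running the same query list once against each index type, with identical
round counts, gives comparable numbers. A median is used here rather than
a mean so that a single slow sample does not dominate the result.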

Feel free to report back with any findings.

