lucene-dev mailing list archives

From: Doug Cutting
Subject: Re: suggestion for a CustomDirectory
Date: Thu, 04 Dec 2003 18:28:58 GMT

Julien Nioche wrote:
> However in most cases the
> application would be faster because :
> - tree access to the Term (this is only the case for the Terms in the .tii)
> - no need to create up to 127 temporary Term objects (with creation of
> Strings and so on....)
> - limit garbage collecting

The .tii is already read into memory when the index is opened.  So the 
only savings would be the creation of (on average) 64 temporary Term 
objects per query.  Do you have any evidence that this is a substantial 
part of the computation?  I'd be surprised if it was.  To find out, you 
could write a program which compares the time it takes to call docFreq() 
on a set of terms (allocating the 64 temporary Terms) to what it takes 
to perform queries (doing the rest of the work).  I'll bet that the 
first is substantially faster: most of the work of executing a query is 
processing the .frq and .prx files.  These are bigger than the RAM on 
your machine, and so cannot be cached.  Thus you'll always be doing some 
disk I/O, which will likely dominate real performance.
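
The comparison suggested above could be set up roughly as follows. This is only a sketch of the timing harness, not code from the thread: the class name DocFreqVsQueryTiming, the helpers timeNanos() and lookupShare(), and the two stand-in callbacks in main() are all hypothetical, and the stand-ins would have to be replaced with calls against a real index before the numbers mean anything.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

public class DocFreqVsQueryTiming {

    /** Calls op once per term and returns the elapsed wall-clock time in nanoseconds. */
    public static long timeNanos(List<String> terms, ToIntFunction<String> op) {
        long sink = 0; // sum the results so the JIT cannot drop the calls as dead code
        long start = System.nanoTime();
        for (String term : terms) {
            sink += op.applyAsInt(term);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically never taken; only keeps sink live
        }
        return elapsed;
    }

    /** Ratio of dictionary-lookup time to full-query time (0.0 if queryNanos is 0). */
    public static double lookupShare(long lookupNanos, long queryNanos) {
        return queryNanos == 0 ? 0.0 : (double) lookupNanos / queryNanos;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "index", "term");

        // Hypothetical stand-ins so the sketch runs without an index.
        // Replace the first with a docFreq() lookup and the second with a full query.
        ToIntFunction<String> docFreqOnly = term -> term.length();
        ToIntFunction<String> fullQuery = term -> {
            int hits = 0;
            for (int i = 0; i < 100_000; i++) {
                hits += (term.hashCode() ^ i) & 1; // placeholder for postings work
            }
            return hits;
        };

        // Run each once untimed so JIT compilation does not skew the measurement.
        timeNanos(terms, docFreqOnly);
        timeNanos(terms, fullQuery);

        long lookup = timeNanos(terms, docFreqOnly);
        long query = timeNanos(terms, fullQuery);
        System.out.printf("docFreq: %d ns, query: %d ns, share: %.4f%n",
                lookup, query, lookupShare(lookup, query));
    }
}
```

Against a real index, the first callback would presumably wrap IndexReader.docFreq(Term) and the second a TermQuery search through IndexSearcher; the exact search method and return type depend on the Lucene version. A small share would support the point that the temporary Term objects are not where the time goes. For the disk I/O effect to show up, the term set should be large enough, and the index big enough, that the .frq and .prx data is not simply served from the OS cache.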

