lucene-dev mailing list archives

From Doug Cutting
Subject Re: How to proceed with Bug 31841 - MultiSearcher problems with Similarity.docFreq() ?
Date Tue, 11 Jan 2005 23:45:49 GMT
Chuck Williams wrote:
> This is a nice solution!  By having MultiSearcher create the Weight, it
> can pass itself in as the searcher, thereby allowing the correct
> docFreq() method to be called.

Glad to hear it at least makes sense... Now I hope it works!

> I'm still left wondering if having MultiSearcher query all the
> RemoteSearchable's on every call to docFreq() within each TermQuery,
> PhraseQuery, SpanQuery and PhrasePrefixQuery is the way to go long term,
> although it seems like the best thing to do right now.  The calls only
> happen when the Weight's are created, so maybe it's not too bad.  Longer
> term, it might be better to distribute the idf information out to the
> RemoteSearchable's to minimize the required number of remote accesses
> for each Query.

I'm not sure exactly what you mean by "distribute the idf information 
out to the RemoteSearchable".  I think one might profitably implement a 
docFreq() cache in RemoteSearchable.  This could be a simple cache, or 
it could be fairly aggressive, pre-fetching all the docFreqs.  (As an 
optimization, it could only pre-fetch those greater than 1, and, when a 
term is not in the cache, assume its docFreq is 1.  As a lossy 
optimization, it could only pre-fetch those greater than N, and somehow 
estimate those not in the cache.)  Is that what you meant?
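
A rough sketch of the two lossless variants described above, in case it
helps make the idea concrete. This is an illustration only: DocFreqCache,
prefetched() and the remoteDocFreq function are hypothetical names, not
part of Lucene, and the function simply stands in for whatever remote
docFreq() call the searchable would make.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical sketch: none of these names come from Lucene.
class DocFreqCache {
    private final Map<String, Integer> cache = new HashMap<>();
    // Stand-in for the remote docFreq() call; null in pre-fetched mode.
    private final ToIntFunction<String> remoteDocFreq;

    // Simple mode: ask the remote side on a miss and remember the answer.
    DocFreqCache(ToIntFunction<String> remoteDocFreq) {
        this.remoteDocFreq = remoteDocFreq;
    }

    // Pre-fetched mode: keep only the entries whose docFreq is greater than 1.
    static DocFreqCache prefetched(Map<String, Integer> allDocFreqs) {
        DocFreqCache c = new DocFreqCache(null);
        for (Map.Entry<String, Integer> e : allDocFreqs.entrySet()) {
            if (e.getValue() > 1) {
                c.cache.put(e.getKey(), e.getValue());
            }
        }
        return c;
    }

    int docFreq(String term) {
        Integer hit = cache.get(term);
        if (hit != null) {
            return hit;
        }
        if (remoteDocFreq == null) {
            // Pre-fetched mode: a term missing from the cache is assumed
            // to have docFreq 1, with no remote call.
            return 1;
        }
        int df = remoteDocFreq.applyAsInt(term);
        cache.put(term, df);
        return df;
    }
}
```

In simple mode each distinct term costs one remote call; in pre-fetched
mode lookups never go remote, at the price of loading the table up front
and of treating terms that are absent from the index as if they occurred
once.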

