lucene-dev mailing list archives

From Grant Ingersoll <>
Subject Re: MoreLikeThisQuery term frequency caching
Date Fri, 10 Apr 2009 13:38:53 GMT
What was your approach to handling stale cache entries? Did you flush
the cache when you opened a new reader?

On Apr 7, 2009, at 2:28 AM, Richard Marr wrote:

> Hi all,
> I've been exploring MoreLikeThisQuery as part of a recent project and
> something that came out of that might be useful to others here.
> I found that using MoreLikeThisQuery could be quite slow for my use
> case, but that most of the time involved was spent looking up term
> frequencies to calculate weightings. Since those term frequencies
> usually don't need to be anywhere near real-time I found that caching
> them in a hashmap had a very good cost/benefit ratio for my
> application, speeding up MLT queries by an order of magnitude.
> My use case was possibly unusual in that I was looking at a limited
> vocabulary rather than full English, but in theory other applications
> that make use of the MLT class could benefit.
> So at this point I have some questions: (1) Have others experienced
> similar performance characteristics for MLT code? (2) Am I missing
> some fatal flaw in this approach? (3) Are the modifications worth
> sharing?
> Cheers,
> Rich
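
For readers following the thread, here is a minimal sketch of the idea being discussed: memoize per-term frequencies in a HashMap, and flush the map when a new reader is opened. It is not Richard's actual modification, which is not shown in this message. The class name DocFreqCache, the onNewReader method, and the use of a numeric reader version are assumptions made for illustration; the lookup function stands in for an expensive index call such as IndexReader.docFreq(Term).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

/** Hypothetical helper, not part of Lucene: caches per-term document frequencies. */
class DocFreqCache {
    private final Map<String, Integer> cache = new HashMap<>();
    // Assumption: stands in for an expensive lookup such as IndexReader.docFreq(Term).
    private final ToIntFunction<String> lookup;
    // Assumption: some number that changes whenever a new reader is opened.
    private long readerVersion;

    DocFreqCache(ToIntFunction<String> lookup, long readerVersion) {
        this.lookup = lookup;
        this.readerVersion = readerVersion;
    }

    /** Returns the cached frequency, hitting the index only on a cache miss. */
    synchronized int docFreq(String term) {
        Integer cached = cache.get(term);
        if (cached == null) {
            cached = lookup.applyAsInt(term);
            cache.put(term, cached);
        }
        return cached;
    }

    /** Flushes every entry if the reader has changed, so stale counts are dropped. */
    synchronized void onNewReader(long newVersion) {
        if (newVersion != readerVersion) {
            cache.clear();
            readerVersion = newVersion;
        }
    }

    /** Number of terms currently cached. */
    synchronized int size() {
        return cache.size();
    }
}
```

Flushing the whole map on reopen is the bluntest answer to the staleness question. Because the frequencies only feed term weighting, an application that tolerates slightly old counts could flush less often, at the cost of weights drifting from the live index. An unbounded map also suits a limited vocabulary better than full English, where a size cap would probably be needed.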
