lucene-solr-user mailing list archives

From Erik Hatcher
Subject Re: MoreLikeThis class in Lucene within Solr?
Date Tue, 12 Sep 2006 20:24:46 GMT

On Sep 12, 2006, at 3:41 PM, Michael Imbeault wrote:
> I haven't looked at the specifics of how MoreLikeThis determines
> which items are similar; I'm mainly wondering about performance
> here. Yesterday I tried to code myself a poor man's similarity
> class (which was nothing more than doing a search with OR between
> words and sorting by score), and the performance was abysmal (well,
> I kinda expected it: with 1000+ word queries on a 15-million-doc
> collection, you don't expect miracles). At first glance I think it
> searches for the most 'relevant' words, am I right? What kind of
> performance are you getting with it?

Performance with MoreLikeThis is not an issue.  It has many  
parameters to tune how many terms are used in the query it builds,  
and it pulls these terms in an extremely efficient manner from the  
Lucene index.
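
To make the idea concrete, here is a rough sketch of why limiting the number of query terms helps. It is not Lucene's code, only a plain-Python illustration of the general technique: score each term of the source document by tf * idf and keep just the top few, rather than OR-ing all 1000+ words. The function name and the parameter names (max_query_terms, min_term_freq, min_doc_freq) are assumptions chosen to resemble MoreLikeThis's tuning knobs, and the doc_freq dictionary stands in for document frequencies that would really come from the index.

```python
import math
from collections import Counter


def select_query_terms(doc_tokens, doc_freq, num_docs,
                       max_query_terms=25, min_term_freq=2, min_doc_freq=5):
    """Return the highest-scoring terms of one document, ranked by tf * idf.

    doc_tokens: list of terms from the source document
    doc_freq:   hypothetical mapping term -> number of docs containing it
    num_docs:   total number of documents in the collection
    """
    term_freq = Counter(doc_tokens)
    scored = []
    for term, freq in term_freq.items():
        df = doc_freq.get(term, 0)
        # Skip terms that are too rare in this document or in the collection.
        if freq < min_term_freq or df < min_doc_freq:
            continue
        # Common words get a low idf, so they fall to the bottom of the ranking.
        idf = math.log(num_docs / (df + 1)) + 1.0
        scored.append((freq * idf, term))
    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:max_query_terms]]


if __name__ == "__main__":
    doc = ["lucene"] * 3 + ["solr"] * 2 + ["the"] * 5 + ["rare"]
    df = {"lucene": 100, "solr": 50, "the": 900000, "rare": 10}
    print(select_query_terms(doc, df, num_docs=1000000, max_query_terms=2))
```

The resulting query has at most max_query_terms clauses no matter how long the source document is, which is what keeps the search cheap compared with a 1000+ word OR query.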

I'm doing some traveling soon, which is always a good time to hack on  
something tractable like adding MoreLikeThis to Solr.  So your wish  
may be granted in a week :)

