lucene-java-user mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject Re: Lucene search optimization
Date Tue, 30 May 2006 15:55:48 GMT
Take a look at "FuzzyLikeThisQuery" in
contrib\queries.

I use it for name searches on large indexes. 
Unlike FuzzyQuery it:
a) limits the number of query terms produced
b) provides better ranking (it disables the idf factor,
which otherwise boosts rare misspellings)

The cost of running a query is strongly related to the
number of terms in the query.
FuzzyQuery limits the terms it produces only by quality
(a minimum similarity threshold), which means a single
fuzzy term can unexpectedly expand into a large number
of terms and therefore a slow query.
FuzzyLikeThis is more explicit - it limits the
*quantity* of terms used (and automatically shortlists
the best-quality terms, ranking them with the same
edit-distance metric as FuzzyQuery).
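
To make the difference concrete, here is a rough sketch of the
shortlisting idea in plain Java. It is not the code from
contrib\queries and does not use the Lucene API: the class name
TermShortlist, the method names and the similarity formula
(1 - distance / longer length) are my own assumptions for
illustration. It shows a quality cut-off (minSimilarity) combined
with a hard cap on the number of terms (maxTerms), using a bounded
heap so that only the best candidates survive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not Lucene's implementation.
public class TermShortlist {

    /** A candidate term paired with its similarity to the query term. */
    static final class ScoredTerm {
        final String text;
        final float similarity;

        ScoredTerm(String text, float similarity) {
            this.text = text;
            this.similarity = similarity;
        }
    }

    // Orders the worst candidate first: lowest similarity, with ties
    // broken by text so that the result is deterministic.
    private static final Comparator<ScoredTerm> WORST_FIRST =
        Comparator.comparingDouble((ScoredTerm t) -> t.similarity)
                  .thenComparing((ScoredTerm t) -> t.text, Comparator.reverseOrder());

    /** Classic Levenshtein edit distance using two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalisation: 1.0 for identical strings, 0.0 for nothing in common. */
    static float similarity(String query, String candidate) {
        int maxLen = Math.max(query.length(), candidate.length());
        if (maxLen == 0) {
            return 1.0f;
        }
        return 1.0f - (float) editDistance(query, candidate) / maxLen;
    }

    /**
     * Keeps at most maxTerms candidates whose similarity is at least
     * minSimilarity, best first. Without the maxTerms cap, the result
     * size depends entirely on the vocabulary.
     */
    static List<String> shortlist(String query, List<String> vocabulary,
                                  float minSimilarity, int maxTerms) {
        List<String> result = new ArrayList<>();
        if (maxTerms <= 0) {
            return result;
        }
        PriorityQueue<ScoredTerm> heap = new PriorityQueue<>(WORST_FIRST);
        for (String candidate : vocabulary) {
            float sim = similarity(query, candidate);
            if (sim < minSimilarity) {
                continue; // quality limit
            }
            heap.add(new ScoredTerm(candidate, sim));
            if (heap.size() > maxTerms) {
                heap.poll(); // quantity limit: evict the current worst
            }
        }
        List<ScoredTerm> kept = new ArrayList<>(heap);
        kept.sort(WORST_FIRST.reversed());
        for (ScoredTerm t : kept) {
            result.add(t.text);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> vocabulary = Arrays.asList(
            "smith", "smyth", "smithe", "smit", "jones", "smythe");
        System.out.println(shortlist("smith", vocabulary, 0.5f, 2));
    }
}
```

With only the minSimilarity check, every close-enough term in the
index ends up in the query. The maxTerms cap is what keeps the query
cost predictable on a large index.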


Cheers,
Mark



	
	
		