lucene-java-user mailing list archives

From Mathieu Lecarme <math...@garambrogne.net>
Subject Re: Why exactly are fuzzy queries so slow?
Date Sat, 24 Nov 2007 17:48:18 GMT
Fuzzy matches are simply not indexed, so a fuzzy query has to compare
the query word against the terms in the index at search time.
If you want to search quickly with fuzzy search, you should index the
words and their n-grams; it's the "did you mean" pattern.

You first select the words in use that share n-grams with the query
word, then compute the distance to each of them with Levenshtein, and
use the closest word as a synonym.
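
A minimal sketch of that idea in plain Java, with no Lucene API involved.
The class name NgramSuggester, the gram size of 3 and the maxDistance
cutoff are assumptions made for illustration only. It builds a map from
each n-gram to the words containing it, picks the words that share at
least one n-gram with the query word, and ranks them by Levenshtein
distance:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
class NgramSuggester {
    private static final int N = 3; // assumed gram size

    // n-gram -> known words that contain it
    private final Map<String, Set<String>> gramIndex = new HashMap<>();

    // Split a word into overlapping n-grams; short words are kept whole.
    public static List<String> ngrams(String word) {
        List<String> grams = new ArrayList<>();
        if (word.length() < N) {
            grams.add(word);
            return grams;
        }
        for (int i = 0; i + N <= word.length(); i++) {
            grams.add(word.substring(i, i + N));
        }
        return grams;
    }

    // Register a word from the vocabulary under each of its n-grams.
    public void add(String word) {
        for (String g : ngrams(word)) {
            gramIndex.computeIfAbsent(g, k -> new HashSet<>()).add(word);
        }
    }

    // Classic dynamic-programming Levenshtein distance, two rows at a time.
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Return the closest known word sharing an n-gram with the query,
    // or null if none is within maxDistance edits.
    public String suggest(String query, int maxDistance) {
        Set<String> candidates = new HashSet<>();
        for (String g : ngrams(query)) {
            Set<String> words = gramIndex.get(g);
            if (words != null) {
                candidates.addAll(words);
            }
        }
        String best = null;
        int bestDist = maxDistance + 1;
        for (String c : candidates) {
            int d = levenshtein(query, c);
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }
}
```

The point of the n-gram lookup is that the Levenshtein distance is only
computed for a handful of candidates instead of for every term. The word
returned by suggest() can then be searched with a normal exact query.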

M.

On 24 Nov 07, at 17:36, Timo Nentwig wrote:

> Hi!
>
> I search a 1.5 GB index and fuzzy queries are really slow, something like
> ~500 ms on average (IndexSearcher.search(Query, HitCollector)).
>
> When performing exact queries I achieve response times <25 ms. What is it
> that makes fuzzy queries so slow? Increased index access due to more terms,
> i.e. disk I/O?
>
> And no, my fuzzy queries (fuzzy factor 0.8) don't blow up into a boolean
> query with hundreds of clauses, but maybe something like fewer than 10.
>
> Thanks
> Timo
>
> P.S. Aren't there any "best practices" for Lucene? Does everybody have to
> find out on his own (over and over again) and spend a lot of time reading
> and understanding Lucene's code base?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>



