lucene-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: FuzzyQuery prefix length
Date Mon, 18 Oct 2004 20:44:36 GMT
Daniel Naber wrote:
> On Tuesday 12 October 2004 17:22, Doug Cutting wrote:
>>Which is worse: a person who searches for Photokopie~ in a 1000 document
>>collection does not find documents containing Fotokopie; or a person who
>>searches for Photokopie~ in a 1M document collection doesn't find
>>anything because it takes too long?  I think some relevant results are
>>better than none.
> 
> I disagree, as the user who doesn't get the "Fotokopie" matches will not 
> understand what's going on. He will assume that there are no such 
> documents, which is wrong.

I disagree.  To assume that, someone would need a detailed
understanding of how "~" works.  Such a person would likely also know 
whether initial characters are considered in the operation of "~".  Most 
users who use "~" would probably use it when they're uncertain of 
spelling, without a detailed understanding of how it works, and, most of 
the time, it will help them.

> If there's a timeout the user will at least 
> notice something is wrong. Besides that, it's the developer's
> responsibility to get things fast enough.

We're talking about the appropriate default.  Defaults are used by 
unsophisticated developers.  A system deployed by an unsophisticated 
developer should not suffer from erratic timeouts.  Users using the 
standard query syntax should enjoy a reasonable experience on 
multi-million document collections without having to tweak things.
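
To make the trade-off concrete, here is a rough sketch of what a required
prefix buys.  It is not Lucene's FuzzyQuery code: the class and method
names are made up for illustration, the term dictionary is a plain list,
and an integer edit-distance cutoff (maxEdits) stands in for the
similarity threshold.  With a prefix length of 0 every term has to be
compared against the query; with a prefix length of 1 only terms sharing
the first character are compared, so Photokopie~ no longer reaches
Fotokopie.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not part of Lucene.
public class PrefixFuzzySketch {

    // Levenshtein distance using two rolling rows of the DP table.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the terms within maxEdits of the query, skipping any term
    // that does not share the query's first prefixLength characters.
    static List<String> fuzzyMatch(List<String> terms, String query,
                                   int prefixLength, int maxEdits) {
        String prefix =
            query.substring(0, Math.min(prefixLength, query.length()));
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (!term.startsWith(prefix)) {
                continue; // cheap rejection, no edit distance computed
            }
            if (editDistance(term, query) <= maxEdits) {
                hits.add(term);
            }
        }
        return hits;
    }
}
```

The sketch still walks the whole list and only skips the distance
computation.  In a sorted term dictionary the prefix would also allow
seeking straight to the matching range, which is presumably where most
of the saving on a large collection comes from.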

Doug


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

