lucene-dev mailing list archives

From "Christoph Kiehl" ...@sulu3000.de>
Subject Re: [PATCH] Refactoring QueryParser.jj, setLowercaseWildcardTerms()
Date Wed, 12 Feb 2003 18:39:20 GMT
Hi Doug,

> Also, I think we should lowercase prefix and wildcard queries by
> default.  This would fix one of the most frequently reported problems.
> Yes, it might also break folks who currently do case-sensitive
> wildcard queries, but I suspect they are far fewer than those who
> will continue to complain about the default case-sensitivity of
> wildcard searches. What do others think?

For the StandardAnalyzer this might work, but with the GermanAnalyzer there
is an additional problem: umlauts (ä, ö, ü) are turned into plain vowels
(a, o, u) during indexing. An example: "Häuser" is the plural of "Haus". If
I index "Häuser", it is stemmed to "hau". If I then search for "häus*",
nothing is found, because the wildcard term "häus" is not stemmed. If I ran
"häus*" through the analyzer, I would get "hau*". The drawback is that the
query then matches not only "Häuser" but also "Haus". Still, I think it is
better to get more results than none. This is perhaps a problem specific to
the GermanAnalyzer. Maybe there could be an option to run wildcard queries
through the Analyzer as well, off by default, so that I can turn it on in
my case.
Hope you understand my problem ;)
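
To make the mismatch concrete, here is a minimal Python sketch of the situation described above. It does not use Lucene at all: toy_analyze, build_index and prefix_search are hypothetical helpers, and the suffix rules are invented only so that "Häuser" and "Haus" both come out as "hau", as in the example. They are not the GermanAnalyzer's real stemming rules.

```python
# Toy model of the problem: index terms are analyzed, wildcard terms are not.
# All names and rules here are assumptions for illustration, not Lucene APIs.

UMLAUTS = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})


def toy_analyze(term):
    """Lowercase, fold umlauts to plain vowels, strip a few suffixes."""
    term = term.lower().translate(UMLAUTS)
    for suffix in ("er", "en", "e"):
        if term.endswith(suffix) and len(term) > len(suffix) + 2:
            term = term[: -len(suffix)]
            break
    if term.endswith("s") and len(term) > 3:
        term = term[:-1]
    return term


def build_index(words):
    """Map each analyzed term to the original words that produced it."""
    index = {}
    for word in words:
        index.setdefault(toy_analyze(word), set()).add(word)
    return index


def prefix_search(index, pattern, analyze_pattern=False):
    """Match 'prefix*' against index terms.

    With analyze_pattern=False the prefix is only lowercased.
    With analyze_pattern=True it goes through the same toy analysis
    as the indexed words (the opt-in behaviour suggested above).
    """
    prefix = pattern.rstrip("*")
    prefix = toy_analyze(prefix) if analyze_pattern else prefix.lower()
    hits = set()
    for term, words in index.items():
        if term.startswith(prefix):
            hits |= words
    return hits


index = build_index(["Häuser", "Haus"])
no_hits = prefix_search(index, "häus*")
more_hits = prefix_search(index, "häus*", analyze_pattern=True)
```

In this sketch no_hits is empty, because "häus" never matches the indexed term "hau", while more_hits contains both "Häuser" and "Haus": more results than strictly wanted, but not zero.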

Regards
Christoph

ps: Lucene otherwise really rocks ;)





