lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrzej Bialecki ...@getopt.org>
Subject Re: umlaut normalisation
Date Tue, 27 Jan 2004 12:48:35 GMT
Thomas Scheffler wrote:

> Hi,
> 
> is that possible with lucene to use umlaut normalisation?
> For example Query: Hühnerstall --> Query: Huehnerstall.
> 
> This ofcause includes that the document was indexed with normalized umlauts.
> This issue is very important, because not every one starting a search
> against german documents may have a german keyboard.

It seems to me the best place would be to put this replacement in a 
custom Analyzer (perhaps extend GermanAnalyzer?).

> This brings me to the next problem. Currently only Luke delivers result
> for "Hühnerstall", my selfed implemented solution allways makes
> "huhnerstall" out of it in the query (Why?). But ther is no "huhnerstall"
> indexed.
>

Please check which Analyzer you're using in each case.

-- 
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message