lucene-java-user mailing list archives

From Karl Øie
Subject RE: Strange Results with German Analyzer
Date Thu, 20 Dec 2001 11:54:16 GMT
take a look at the end of

	public final TokenStream tokenStream( String fieldName, Reader reader ) {
		TokenStream result = new StandardTokenizer( reader );
		result = new StandardFilter( result );
		result = new StopFilter( result, stoptable );
		result = new GermanStemFilter( result, excltable );
		// Convert to lowercase after stemming!
		result = new LowerCaseFilter( result );
		return result;
	}

as you can see, the analyzer converts all words to lowercase to save some
space. you can of course remove the LowerCaseFilter to get case-sensitive
search. the reason why holland gives 1 result and hollAnd returns 22 i can not
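
a rough sketch of why the filter order in the chain above matters, written in plain Java without Lucene so it runs on its own. everything here is a stand-in: the class `FilterOrderSketch`, its helper methods and the small stop list are made up for illustration, and the sketch assumes the stop list holds lowercase entries and is matched exactly. it only shows that a case-sensitive stage placed before lowercasing sees `Und` and `und` as different tokens; it does not claim to explain the holland result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FilterOrderSketch {
    // Hypothetical stop list (assumption: lowercase entries, exact-match lookup).
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("und", "der", "die", "das"));

    // Toy stand-in for a stop filter: drops tokens that match the list exactly.
    static List<String> dropStopWords(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    // Toy stand-in for a lowercase filter.
    static List<String> lowerCase(List<String> tokens) {
        List<String> lowered = new ArrayList<>();
        for (String token : tokens) {
            lowered.add(token.toLowerCase(Locale.ROOT));
        }
        return lowered;
    }

    // Same relative order as the quoted chain: stop words first, lowercase last.
    static List<String> stopThenLower(List<String> tokens) {
        return lowerCase(dropStopWords(tokens));
    }

    // Alternative order: lowercase first, then stop words.
    static List<String> lowerThenStop(List<String> tokens) {
        return dropStopWords(lowerCase(tokens));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Und", "Holland", "und", "Belgien");
        // "Und" slips past the stop list because it is still capitalized.
        System.out.println(stopThenLower(tokens)); // prints [und, holland, belgien]
        // Lowercasing first lets the stop list catch both spellings.
        System.out.println(lowerThenStop(tokens)); // prints [holland, belgien]
    }
}
```

the same reasoning applies to any stage that looks at the original case of a token before the lowercase step runs.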

regards, karl øie

-----Original Message-----
From: Jan Stövesand
Sent: 20 December 2001 12:36
To: Lucene Users List
Subject: Strange Results with German Analyzer


I used the German analyzer for indexing and searching. AFAIK, the search is
case-insensitive. At least I get the same search results for


But, for some words the Analyzer behaves somewhat funny:

Holland -> 22 results
hollAnd -> 22 results
hollanD -> 22 results
HOLLAND -> 22 results

holland -> 1 result (!) which is NOT in the 22 results mentioned above.

I have no idea why, and my knowledge of searching, stemming, indexing, etc. is,
well, small.

