lucene-dev mailing list archives

From Rafael Cunha de Almeida <almeida...@gmail.com>
Subject [PATCH] Bug on BrazilianAnalyzer
Date Mon, 17 Nov 2008 23:39:39 GMT
Following is a patch for what I think is a bug in the
BrazilianAnalyzer. The default stopword list is all lowercase, so
stopword removal only works if the LowerCaseFilter runs before the
StopFilter or if the StopFilter is set to ignore case. As it stands,
capitalized stopwords pass through the StopFilter untouched.

Since the LowerCaseFilter is instantiated anyway, I just moved it
ahead of the StopFilter. If there's some problem with that order, then
please consider setting the StopFilter to ignore case instead.

Index: BrazilianAnalyzer.java
===================================================================
--- BrazilianAnalyzer.java	(revision 718407)
+++ BrazilianAnalyzer.java	(working copy)
@@ -131,10 +131,9 @@
 	public final TokenStream tokenStream(String fieldName, Reader reader) {
 		TokenStream result = new StandardTokenizer( reader );
 		result = new StandardFilter( result );
+		result = new LowerCaseFilter( result );
 		result = new StopFilter( result, stoptable );
 		result = new BrazilianStemFilter( result, excltable );
-		// Convert to lowercase after stemming!
-		result = new LowerCaseFilter( result );
 		return result;
 	}
 }
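
To illustrate why the order matters, here is a minimal plain-Java sketch
that does not use Lucene at all. The class name, the helper methods, and
the four-word stop list are made up for the example; the real analyzer
uses its own stop table and also applies stemming, which is left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the filter chain; not Lucene code.
public class FilterOrderSketch {
    // Assumed sample stop list, all lowercase like the analyzer's default list.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "o", "de", "para"));

    // Plays the role of LowerCaseFilter: lowercases every token.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Plays the role of a case-sensitive StopFilter: drops exact matches only.
    static List<String> removeStopWords(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!STOP_WORDS.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("O", "livro", "De", "Maria");

        // Old order: stop filtering first, lowercasing last.
        // "O" and "De" do not match the lowercase list, so they survive.
        System.out.println(lowerCase(removeStopWords(tokens)));
        // prints [o, livro, de, maria]

        // Patched order: lowercasing first, then stop filtering.
        System.out.println(removeStopWords(lowerCase(tokens)));
        // prints [livro, maria]
    }
}
```

With the old order, capitalized stopwords end up in the index in
lowercase form; with the patched order they are removed as intended.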
