lucene-java-user mailing list archives

From Clemens Wyss DEV <clemens...@mysign.ch>
Subject porting a custom Analyzer from 3.6 -> 4.0
Date Sun, 09 Dec 2012 13:15:23 GMT
I have a CustomAnalyzer which overrides "public final TokenStream tokenStream ( String fieldName, Reader reader )":
@Override
public final TokenStream tokenStream ( String fieldName, Reader reader )
{
    boolean fieldRequiresExactMatching = IndexManager.getInstance().isExactMatchField( fieldName );

    Reader localreader = reader;
    if ( !fieldRequiresExactMatching )
    {
        NormalizeCharMap charMap = new NormalizeCharMap();
        charMap.add(",", " ");
        <SNIP>
        // wrap/filter reader
        localreader = new MappingCharFilter( charMap, reader );
    }
    TokenStream t = new WhitespaceAnalyzer( IndexManager.CURRENT_LUCENE_VERSION ).tokenStream( fieldName, localreader );

    if ( !fieldRequiresExactMatching )
    {
        // apply stop word filter
        Set<String> stopWordSet = null;
        <SNIP>
        if ( stopWordSet != null )
        {
            // wrap/filter stream
            StopFilter stopFilter = new StopFilter( IndexManager.CURRENT_LUCENE_VERSION, t, stopWordSet, true );
            t = stopFilter;
        }
    }
    return t;
}

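So the chain is: MappingCharFilter -> whitespace analysis -> stop word filtering. The MappingCharFilter and the stop word filter are only applied if the field does not require exact matching, and the stop word filter only if a stop word set is available.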
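
To make the order of the three stages concrete, here is a rough plain-Java model of the chain. It deliberately uses no Lucene classes: AnalysisChainSketch, analyze and the isExactMatchField stand-in are hypothetical names for illustration only, the single "," -> " " mapping is the only one shown in the snippet above, and splitting on \s+ is only an approximation of what the whitespace analysis does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not Lucene API.
final class AnalysisChainSketch {

    // Stand-in for IndexManager.getInstance().isExactMatchField( fieldName ).
    // Assumption: only a field called "id" requires exact matching.
    static boolean isExactMatchField(String fieldName) {
        return "id".equals(fieldName);
    }

    // stopWords is expected to hold lower-case entries, or to be null.
    static List<String> analyze(String fieldName, String text, Set<String> stopWords) {
        boolean exact = isExactMatchField(fieldName);

        // Stage 1: character mapping, skipped for exact-match fields.
        String mapped = exact ? text : text.replace(",", " ");

        // Stage 2: whitespace tokenization, always applied.
        List<String> tokens = new ArrayList<>();
        for (String token : mapped.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields one empty string from split()
            }
            // Stage 3: case-insensitive stop word removal, skipped for
            // exact-match fields and when no stop word set is available.
            boolean isStopWord = !exact
                    && stopWords != null
                    && stopWords.contains(token.toLowerCase(Locale.ROOT));
            if (!isStopWord) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}
```

With the stop word set { "and" }, the input "red,green and blue" would come out as red / green / blue for a normal field, and unchanged apart from whitespace splitting (red,green / and / blue) for the exact-match field.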

As of Lucene 4.0, "protected TokenStreamComponents createComponents ( final String fieldName, final Reader reader )" is to be overridden, and a TokenStreamComponents has to be returned.
I don't see how to achieve this ... all I have is a TokenStream, but no Tokenizer ...