lucene-java-user mailing list archives

From Dan Armbrust <daniel.armbrust.l...@gmail.com>
Subject Analyzer question
Date Mon, 08 Aug 2005 14:43:53 GMT
It is my understanding that the StandardAnalyzer will remove underscores,
so "some_word" would be indexed as 'some' and 'word'.

I want to keep the underscores, so I was thinking of changing over to an 
Analyzer that uses the WhitespaceTokenizer, LowerCaseFilter, and StopFilter.
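
To make the intended behavior concrete, here is a rough sketch of what such a chain would do to a piece of text. It is plain Java that imitates the three steps (split on whitespace, lowercase, drop stop words) and does not use the Lucene classes themselves. The class name WhitespacePipelineSketch, the analyze method, and the stop word list are all made up for illustration, and the regex split is only an approximation of how a whitespace tokenizer decides where tokens end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class WhitespacePipelineSketch {
    // Hypothetical stop list, for illustration only; not Lucene's default list.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "the", "of", "to");

    // Imitates: whitespace tokenizer -> lowercase filter -> stop filter.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        // Step 1: split on runs of whitespace only; punctuation and
        // underscores stay attached to the token they appear in.
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // happens only when the input is blank
            }
            // Step 2: lowercase the token.
            String lower = raw.toLowerCase(Locale.ROOT);
            // Step 3: drop the token if it is an exact stop word match.
            if (!STOP_WORDS.contains(lower)) {
                tokens.add(lower);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [some_word, end,, done.]
        System.out.println(analyze("Some_Word and the END, done."));
    }
}
```

In this sketch "some_word" survives intact, but "end," and "done." also keep their trailing punctuation, since nothing in the chain strips it.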

What other tokenizing magic will I lose by changing away from the 
StandardAnalyzer?

Thanks,

Dan

-- 
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/



