lucene-java-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: StopAnalyzer and apostrophes
Date Thu, 06 Apr 2006 16:26:31 GMT
I wrote:

> It looks like StopAnalyzer tokenizes by letter, and doesn't handle  
> apostrophes.  So, the input "I don't know" produces these tokens:
>
>     don
>     t
>     know
>
> Is that right?

It's not right.  StopAnalyzer does tokenize by letter, splitting at
the apostrophe, but 't' is a stopword and gets filtered out, so the
tokens are:

     don
     know

Phew, that's much more useful.
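
For illustration, here is a minimal sketch of the behaviour described above: split on anything that is not a letter, lowercase, then drop stopwords. It is plain Java and does not call Lucene. The class name LetterStopSketch, the analyze method, and the short stop list are all hypothetical stand-ins. The only thing taken from this thread is that 't' is a stopword, and the real StopAnalyzer list is longer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class LetterStopSketch {
    // Illustrative stop list only (an assumption, not Lucene's actual list).
    // It includes "t" because the message above says 't' is a stopword.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "s", "t", "the");

    // Splits on every non-letter character, lowercases each token,
    // and drops any token found in STOP_WORDS.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        // Run one step past the end so the final token is flushed.
        for (int i = 0; i <= text.length(); i++) {
            boolean isLetter = i < text.length() && Character.isLetter(text.charAt(i));
            if (isLetter) {
                current.append(Character.toLowerCase(text.charAt(i)));
            } else if (current.length() > 0) {
                String token = current.toString();
                if (!STOP_WORDS.contains(token)) {
                    tokens.add(token);
                }
                current.setLength(0);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The apostrophe splits "don't" into "don" and "t"; "t" is then dropped.
        System.out.println(analyze("don't know")); // prints [don, know]
    }
}
```

With an empty stop list the same input would yield don, t, know, which is the output guessed at in the quoted message.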

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

