lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From yahootintin.11533...@bloglines.com
Subject Inconsistent StandardTokenizer behaviour
Date Tue, 22 Nov 2005 00:39:59 GMT
This is the results for the StandardTokenizer:
   input - output token -
output type
1. 1.2   - 1.2          - <HOST>
2. 1.2.  - 1.2          - <HOST>

3. a.b   - a.b          - <HOST>
4. a.b.  - a.b.         - <ACRONYM>
5.
www.apache.org  - www.apache.org  - <HOST>
6. www.apache.org. - www.apache.org.
- <ACRONYM>

Number 6 should still be a <HOST> type, shouldn't it?  This
causes problems for the StandardFilter.  Why is it saying its an <ACRONYM>?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message