lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Strange tokenization with StandardFilter
Date Wed, 23 Nov 2005 10:12:15 GMT

On 21 Nov 2005, at 18:54, yahootintin.11533894@bloglines.com wrote:

> I'm using a StandardFilter and seeing some strange tokenization.
>
> Here's
> the input:
> apache.org hosts lucene at apache.org.
>
> Here's the tokens it
> outputs:
>  apache.org
>  hosts
>  lucene
>  at
>  apacheorg
>
> Is this a bug
> that apache.org and apache.org. don't convert to the same token?


Didn't you just report this same issue?

The behavior certainly is not sensible in this case.  So I'd call it  
a bug, yes.  Again, the trailing '.' is the culprit.

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message