lucene-java-user mailing list archives

From yahootintin.11533...@bloglines.com
Subject Re: Strange tokenization with StandardFilter
Date Wed, 23 Nov 2005 18:39:28 GMT
Yes, this is a repeat... I mailed this a few days ago and it never made
it to the list, so I reposted.  Now it suddenly appears... weird!

--- java-user@lucene.apache.org wrote:

> On 21 Nov 2005, at 18:54, yahootintin.11533894@bloglines.com wrote:
>
> > I'm using a StandardFilter and seeing some strange tokenization.
> >
> > Here's the input:
> > apache.org hosts lucene at apache.org.
> >
> > Here's the tokens it outputs:
> >  apache.org
> >  hosts
> >  lucene
> >  at
> >  apacheorg
> >
> > Is this a bug that apache.org and apache.org. don't convert to the
> > same token?
>
> Didn't you just report this same issue?
>
> The behavior certainly is not sensible in this case.  So I'd call it
> a bug, yes.  Again, the trailing '.' is the culprit.
>
> 	Erik
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
> 
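For anyone reading this in the archive, here is a rough sketch of how a trailing '.' could produce the output quoted above. It does not use Lucene at all: the class name TrailingDotSketch, the two regular expressions, and the tokenize method are made up for illustration. It assumes, as I understand the behavior, that text made of letter runs each followed by a dot is treated as an acronym and has its dots stripped, while a dotted name without the final dot is kept as is. Treat it as an approximation of the symptom, not as the actual grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Plain-JDK imitation of the symptom in this thread; NOT Lucene code.
class TrailingDotSketch {
    // Assumed "acronym" shape: two or more letter runs, each followed by '.'
    // (matches "U.S.A." and, because of the trailing dot, "apache.org.").
    private static final Pattern ACRONYM =
            Pattern.compile("(?:[A-Za-z]+\\.){2,}");

    // Assumed token shapes: an acronym, or alphanumeric runs joined by dots.
    private static final Pattern TOKEN =
            Pattern.compile("(?:[A-Za-z]+\\.){2,}|[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)*");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            String token = m.group();
            // Filter step: strip the dots from anything classified as an acronym.
            if (ACRONYM.matcher(token).matches()) {
                token = token.replace(".", "");
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [apache.org, hosts, lucene, at, apacheorg]
        System.out.println(tokenize("apache.org hosts lucene at apache.org."));
    }
}
```

Under these assumptions the first apache.org is not followed by a dot, so it is kept whole, while the sentence-final apache.org. has the acronym shape and loses its dots. The same rule turns U.S.A. into USA, which is presumably the case the dot stripping was meant for.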
