lucene-java-user mailing list archives

From Daniel Naber
Subject Re: Inconsistent tokenizing of words containing underscores.
Date Mon, 29 Aug 2005 19:15:26 GMT
On Monday 29 August 2005 19:21, Jeremy Meyer wrote:

> The expected behavior is to sometimes treat a character as indicating a
> new token and other times to ignore the same character?

It depends on whether the token contains digits. I think this is documented
in the JavaCC source for the tokenizer.
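
To make the rule concrete, here is a rough sketch of how I understand it: parts joined by an underscore stay together as one token only if every other part contains a digit; otherwise the underscore acts as a separator. This is a simplified model, not Lucene's actual code. The class and method names are made up for illustration, it assumes the "every other part" reading of the grammar is right, and it only handles a single word made of letters, digits and underscores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
public class UnderscoreTokenSketch {

    // True if the part contains at least one digit.
    static boolean hasDigit(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isDigit(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Splits one word on '_' but keeps the longest run of parts together
    // when every other part in that run contains a digit (assumed rule).
    static List<String> tokenize(String word) {
        String[] parts = word.split("_", -1); // -1 keeps empty parts from "__"
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < parts.length) {
            if (parts[i].isEmpty()) { // empty part: nothing to emit
                i++;
                continue;
            }
            int last = i;          // index of the last part in this token
            boolean evenOk = true; // parts at even offsets all have a digit
            boolean oddOk = true;  // parts at odd offsets all have a digit
            for (int j = i; j < parts.length && !parts[j].isEmpty(); j++) {
                boolean digit = hasDigit(parts[j]);
                if ((j - i) % 2 == 0) {
                    evenOk &= digit;
                } else {
                    oddOk &= digit;
                }
                if (!evenOk && !oddOk) {
                    break; // the run can no longer be a single token
                }
                if (j > i) {
                    last = j;
                }
            }
            tokens.add(String.join("_", Arrays.asList(parts).subList(i, last + 1)));
            i = last + 1;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo_bar"));  // [foo, bar]
        System.out.println(tokenize("foo_bar1")); // [foo_bar1]
    }
}
```

So under this reading "foo_bar" is split at the underscore, while "foo_bar1" is kept whole because one side of the underscore contains a digit.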


