lucene-java-user mailing list archives

From Jim Downing
Subject Underscore tokenization
Date Fri, 09 Jul 2004 15:36:23 GMT

I'm trying to put together an Analyzer that doesn't separate tokens on
the underscore character. What's the best / easiest way to achieve this?

I've tried removing the references to char code 95 in
StandardTokenizerTokenManager, but it doesn't seem to cut the mustard.
Should I be looking at modifying StandardTokenizer.jj and having javacc
generate my own tokenizer classes?
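
For illustration only, the behaviour being asked for (underscore treated as part of a token rather than as a separator) can be sketched in plain Java, with no Lucene dependency. The class name UnderscoreTokens and the tokenize helper are hypothetical and are not part of Lucene. The sketch does not reproduce StandardTokenizer's grammar (e-mail addresses, acronyms, host names and so on); it only shows the character-level rule. In Lucene the same kind of rule is usually expressed by subclassing CharTokenizer and overriding isTokenChar(char), which may be a lighter-weight alternative to regenerating the tokenizer from StandardTokenizer.jj when the full grammar is not needed.

```java
import java.util.ArrayList;
import java.util.List;

class UnderscoreTokens {
    // Hypothetical rule: letters, digits and '_' all belong to a token.
    static boolean isTokenChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // Split the text into maximal runs of token characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A separator ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [foo_bar, baz, qux, my_field_1]
        System.out.println(tokenize("foo_bar baz-qux my_field_1"));
    }
}
```

Here the hyphen still separates tokens while the underscore does not, which is the distinction the question is about.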

