lucene-dev mailing list archives

From ami dudu <>
Subject Enhance StandardTokenizer to support words which will not be tokenized
Date Wed, 03 Jun 2009 11:07:18 GMT

Hi,

I'm using a StandardTokenizer, which does a great job for me, but I need to
enhance it so that terms like "c++", "c#", and ".net" are kept as-is and not
reduced to "c" or "net".
I know there are other tokenizers, such as KeywordTokenizer and
WhitespaceTokenizer, but they do not include the StandardTokenizer logic.
Any ideas on the best way to add this enhancement?
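
To make the requirement concrete, here is a minimal sketch of one possible workaround that stays outside the tokenizer: rewrite the protected terms into plain alphabetic placeholders before the text reaches StandardTokenizer, so that each one survives as a single token. Everything named here is an assumption for illustration only. The class ProtectedTerms, the method protect, and the placeholder spellings (cplusplus, csharp, dotnet) are hypothetical and not part of Lucene; the code uses only java.util.regex.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene class: maps terms that the tokenizer
// would otherwise split onto alphabetic placeholders.
public class ProtectedTerms {

    // Assumed term list and placeholder spellings; adjust to your own data.
    private static final Map<String, String> PLACEHOLDERS = new LinkedHashMap<>();
    static {
        PLACEHOLDERS.put("c++", "cplusplus");
        PLACEHOLDERS.put("c#", "csharp");
        PLACEHOLDERS.put(".net", "dotnet");
    }

    private static final Pattern PROTECTED = buildPattern();

    private static Pattern buildPattern() {
        StringBuilder alternatives = new StringBuilder();
        for (String term : PLACEHOLDERS.keySet()) {
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            // Quote each term so '+', '#' and '.' are matched literally.
            alternatives.append(Pattern.quote(term));
        }
        // Only match when the term is not glued to a letter or digit,
        // so "abc++" or "asp.net" are left untouched.
        return Pattern.compile(
                "(?<![\\p{L}\\p{N}])(?:" + alternatives + ")(?![\\p{L}\\p{N}])",
                Pattern.CASE_INSENSITIVE);
    }

    // Replaces every protected term in the text with its placeholder.
    public static String protect(String text) {
        Matcher m = PROTECTED.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String placeholder = PLACEHOLDERS.get(m.group().toLowerCase(Locale.ROOT));
            m.appendReplacement(out, Matcher.quoteReplacement(placeholder));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(protect("Looking for C++, C# and .NET developers"));
    }
}
```

With this approach the same rewrite would have to be applied to query text as well as to indexed text, otherwise a search for "c++" would not match the placeholder stored in the index.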
