lucene-dev mailing list archives

From ami dudu <amidu...@gmail.com>
Subject Enhance StandardTokenizer to support words which will not be tokenized
Date Wed, 03 Jun 2009 11:07:18 GMT

Hi, I'm using StandardTokenizer, which does a great job for me, but I need to
enhance it so that words like "c++", "c#", and ".net" are kept as-is and not
tokenized into "c" or "net".
I know there are other tokenizers, such as KeywordTokenizer and
WhitespaceTokenizer, but they do not include the StandardTokenizer logic.
Any ideas on the best way to add this enhancement?
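
To make the desired behavior concrete, here is a minimal sketch in plain Java.
It does not use any Lucene API: the letter/digit pattern is only a crude
stand-in for the StandardTokenizer grammar, and the class and method names
(ProtectedWordSketch, tokenize) are hypothetical, made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Lucene.
class ProtectedWordSketch {

    // Returns tokens, keeping each protected word intact and otherwise
    // splitting on anything that is not a letter or a digit.
    static List<String> tokenize(String text, List<String> protectedWords) {
        // Try longer protected words first so that a longer word is
        // preferred over a shorter one that is its prefix.
        List<String> sorted = new ArrayList<>(protectedWords);
        sorted.sort(Comparator.comparingInt(String::length).reversed());

        // Build an alternation: quoted protected words, then plain words.
        List<String> alternatives = new ArrayList<>();
        for (String word : sorted) {
            alternatives.add(Pattern.quote(word));
        }
        // Assumption: runs of letters/digits approximate ordinary tokens.
        alternatives.add("[\\p{L}\\p{N}]+");

        Pattern pattern = Pattern.compile(String.join("|", alternatives));
        Matcher matcher = pattern.matcher(text);

        List<String> tokens = new ArrayList<>();
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> protectedWords = List.of("c++", "c#", ".net");
        // prints [I, use, c++, c#, and, .net, daily]
        System.out.println(tokenize("I use c++, c# and .net daily", protectedWords));
    }
}
```

What I am after is this same outcome, but with the real StandardTokenizer
rules applied to everything that is not on the protected list.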

Thanks,
Amid

