lucene-java-user mailing list archives

From Xiaolong Zheng <zhengxiaol...@gmail.com>
Subject How to prevent WordDelimiterFilter from tokenizing strings with underscores?
Date Wed, 15 Jun 2016 19:32:20 GMT
Hi,

How can I prevent WordDelimiterFilter from splitting strings that contain
underscores, e.g. word_with_underscore?

I am using WordDelimiterFilter to build my own camel-case analyzer, with the
following configuration flags:

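int flags = 0;
flags |= GENERATE_WORD_PARTS;   // emit the sub-word tokens
flags |= SPLIT_ON_CASE_CHANGE;  // split at lowercase-to-uppercase transitions
flags |= PRESERVE_ORIGINAL;     // also emit the original, unsplit token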


However, I noticed that one side effect of using SPLIT_ON_CASE_CHANGE
appears to be that strings containing underscores are also split.

How can I keep the filter from splitting strings that contain underscores?
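
For illustration, the behavior described above can be modeled with a small
stand-alone Java sketch. It does not call Lucene; the class DelimiterSketch,
the Kind enum and the keepUnderscore switch are hypothetical names made up
for this example. It only shows the general idea of classifying '_' as a
word character instead of a delimiter, so that case-change splitting still
works while underscores stay inside the token. In Lucene itself,
WordDelimiterFilter has a constructor that accepts a per-character type
table, which may be the place to apply the same idea.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model only: this is NOT Lucene's implementation.
class DelimiterSketch {
    // Hypothetical character classes for this sketch.
    enum Kind { LOWER, UPPER, DIGIT, NEUTRAL, DELIM }

    // Classify one character. When keepUnderscore is true (an assumption of
    // this sketch), '_' counts as a caseless word character, not a delimiter.
    static Kind kindOf(char c, boolean keepUnderscore) {
        if (c == '_' && keepUnderscore) return Kind.NEUTRAL;
        if (Character.isLowerCase(c)) return Kind.LOWER;
        if (Character.isUpperCase(c)) return Kind.UPPER;
        if (Character.isDigit(c)) return Kind.DIGIT;
        return Kind.DELIM;
    }

    // Split on delimiter characters and on lowercase-to-uppercase transitions.
    static List<String> split(String s, boolean keepUnderscore) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Kind previous = Kind.DELIM;
        for (char c : s.toCharArray()) {
            Kind kind = kindOf(c, keepUnderscore);
            boolean caseChange = previous == Kind.LOWER && kind == Kind.UPPER;
            if ((kind == Kind.DELIM || caseChange) && current.length() > 0) {
                parts.add(current.toString());
                current.setLength(0);
            }
            if (kind != Kind.DELIM) {
                current.append(c); // delimiters themselves are dropped
            }
            previous = kind;
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("word_with_underscore", false)); // [word, with, underscore]
        System.out.println(split("word_with_underscore", true));  // [word_with_underscore]
        System.out.println(split("camelCaseWord", true));         // [camel, Case, Word]
    }
}
```

With keepUnderscore set to false the sketch reproduces the unwanted split;
with it set to true the underscore token stays whole while camelCaseWord is
still split at its case changes.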




Sincerely,

--Xiaolong
