Hi,
I want to implement a custom indexing rule.
Assume a sentence like the following (note the commas):
I am in China,I am in USA,I am in UK
I would like Lucene to index the sentence above according to this rule:
1) Split the sentence on the comma (,), so we get (I am in China) (I am in USA) (I am in UK).
2) Lucene then just stores each of the short sentences from step 1 without analysis (NOT_ANALYZED).
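
To make the rule concrete, here is a rough sketch of step 1 in plain Java. The class name CommaSplitSketch, the method name splitOnComma, and the field name "sentence" are made up for illustration. For step 2, my understanding is that in Lucene 3.x each piece could be added as its own field value with Field.Index.NOT_ANALYZED; that part appears only as a comment and is an assumption, not tested code.

```java
import java.util.ArrayList;
import java.util.List;

public class CommaSplitSketch {

    // Step 1: split the text on commas, trimming whitespace and skipping empty pieces.
    static List<String> splitOnComma(String text) {
        List<String> parts = new ArrayList<String>();
        for (String piece : text.split(",")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        for (String part : splitOnComma("I am in China,I am in USA,I am in UK")) {
            // Step 2 (assumption, Lucene 3.x, not executed here):
            // doc.add(new Field("sentence", part, Field.Store.YES, Field.Index.NOT_ANALYZED));
            System.out.println(part);
        }
    }
}
```

With this approach every short sentence would become one untokenized value of the same field, so a search would have to match the whole short sentence exactly.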
P.S. Which characters does Lucene not support, and how many of them are there?
I entered a^b and got this exception:
org.apache.lucene.queryParser.ParseException: Cannot parse 'a^b: Lexical error at line 1,
column 4. Encountered: "\u671d" (26397), after : ""
Thanks,
2011-10-24
janwen | China
website : http://www.qianpin.com/ |