lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Philip Puffinburger" <ppuffinbur...@tlcdelivers.com>
Subject RE: 2.3.2 -> 2.4.0 StandardTokenizer issue
Date Sat, 21 Feb 2009 16:23:53 GMT
That's something we can try.   I don't know how much it performance we'd lose doing that as
our custom filter has to decompose the tokens to do its operations.   So instead of 0..1 conversions
we'd be doing 1..2 conversions during indexing and searching.

-----Original Message-----
From: Robert Muir [mailto:rcmuir@gmail.com] 
Sent: Saturday, February 21, 2009 8:35 AM
To: java-user@lucene.apache.org
Subject: Re: 2.3.2 -> 2.4.0 StandardTokenizer issue

normalize your text to NFC. then it will be \u0043 \u00F3 \u006D \u006F and
will work...


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message