lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject Re: Is StandardAnalyzer good enough for multi languages...
Date Wed, 09 Jan 2013 10:09:56 GMT
On Wed, Jan 9, 2013 at 5:25 PM, Steve Rowe <sarowe@gmail.com> wrote:
> Dude.  Go look.  It allows for per-script specialization, with (non-UAX#29) specializations
> by default for Thai, Lao, Myanmar and Hebrew.  See DefaultICUTokenizerConfig.  It's filled
> with exactly the opposite of what you were describing.

I guess that's a reasonable start. It still has no specialisation for
plain Latin (Roman) script, but I guess one could always be added.
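
To make "per-script specialisation" concrete, here is a rough sketch of the
idea using only the JDK. It is not Lucene's API: the class, the ScriptRules
interface and the specialize() method are all made-up names, and the
character-level rule used for Latin in the usage note is only a stand-in for
real custom rules. It just shows text being split into script runs, with each
run handed to whichever break rules are registered for that script.

```java
import java.lang.Character.UnicodeScript;
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative only: a JDK-only sketch of per-script dispatch, not Lucene code.
class PerScriptTokenizerSketch {

    /** Hypothetical hook: supplies the boundary rules to use for one script. */
    interface ScriptRules {
        BreakIterator newBreakIterator();
    }

    // Scripts with their own rules; everything else uses the fallback.
    private final Map<UnicodeScript, ScriptRules> overrides =
            new EnumMap<>(UnicodeScript.class);

    // Fallback: the JDK's generic word-boundary rules.
    private final ScriptRules fallback =
            () -> BreakIterator.getWordInstance(Locale.ROOT);

    void specialize(UnicodeScript script, ScriptRules rules) {
        overrides.put(script, rules);
    }

    List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int runStart = 0;
        while (runStart < text.length()) {
            // Extend the run until a code point of a different script appears.
            // COMMON and INHERITED code points (spaces, punctuation, combining
            // marks) stay attached to the run they occur in.
            UnicodeScript runScript = UnicodeScript.COMMON;
            int i = runStart;
            while (i < text.length()) {
                int cp = text.codePointAt(i);
                UnicodeScript s = UnicodeScript.of(cp);
                boolean neutral = s == UnicodeScript.COMMON
                        || s == UnicodeScript.INHERITED;
                if (!neutral) {
                    if (runScript == UnicodeScript.COMMON) {
                        runScript = s;
                    } else if (s != runScript) {
                        break;
                    }
                }
                i += Character.charCount(cp);
            }
            emit(text.substring(runStart, i), runScript, tokens);
            runStart = i;
        }
        return tokens;
    }

    // Tokenize one script run with the rules registered for its script.
    private void emit(String run, UnicodeScript script, List<String> out) {
        BreakIterator it = overrides.getOrDefault(script, fallback).newBreakIterator();
        it.setText(run);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = run.substring(start, end);
            // Keep only pieces that contain at least one letter or digit.
            if (piece.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(piece);
            }
        }
    }
}
```

Calling specialize(UnicodeScript.LATIN, () -> BreakIterator.getCharacterInstance())
would change how Latin runs are split while leaving every other script on the
fallback rules, which is the sort of thing I meant by "it could always be
added".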

TX
