lucene-java-user mailing list archives

From Cheolgoo Kang <app...@gmail.com>
Subject Re: korean and lucene
Date Tue, 04 Oct 2005 01:11:43 GMT
StandardAnalyzer's JavaCC-based tokenizer (StandardTokenizer.jj) does not
recognize the Korean part of the Unicode character blocks.

You should either 1) use CJKAnalyzer or 2) add the Korean character
block (0xAC00~0xD7AF) to the CJK token definition in the
StandardTokenizer.jj file.
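
To see whether your documents actually contain characters from that block,
a small check like the following may help. It is only a sketch using the
plain JDK, not Lucene code: the class and method names (HangulRange,
isHangulSyllable, containsHangul) are made up for illustration, and the
range is simply the one quoted above.

```java
class HangulRange {
    // Range quoted above: the Unicode Hangul Syllables block.
    static final int HANGUL_START = 0xAC00;
    static final int HANGUL_END = 0xD7AF;

    // True if the character falls inside the Korean block.
    static boolean isHangulSyllable(char c) {
        return c >= HANGUL_START && c <= HANGUL_END;
    }

    // True if any character of the text falls inside the Korean block.
    static boolean containsHangul(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (isHangulSyllable(text.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Three Hangul syllables followed by Latin text (hypothetical sample).
        String sample = "\uD55C\uAD6D\uC5B4 lucene";
        System.out.println(containsHangul(sample));
    }
}
```

If this reports Korean text in your input, those characters are the ones
the CJK token definition would need to cover for option 2.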

Hope it helps.


On 10/4/05, John Wang <john.wang@gmail.com> wrote:
> Hi:
>
> We are running into problems with searching on korean documents. We are
> using the StandardAnalyzer and everything works with Chinese and Japanese.
> Are there known problems with Korean with Lucene?
>
> Thanks
>
> -John
>
>


--
Cheolgoo


