lucene-dev mailing list archives

From: Erik Hatcher <e...@ehatchersolutions.com>
Subject: Re: How to index Chinese text?
Date: Mon, 13 Jun 2005 17:25:23 GMT

On Jun 13, 2005, at 12:01 PM, Zsolt Koppany wrote:

> Our application works with lucene-1.4.3 stable even for German text,
> but we have problems with Chinese text. Which analyzer should we use
> to index Chinese text?

This question is best posted to java-user, not java-dev, but I'll  
reply here for now.

The answer is "it depends" on what you want to do.
StandardAnalyzer will tokenize CJK characters individually.  In the
contrib area of the Subversion repository, under "analyzers", there
are also a ChineseAnalyzer and a CJKAnalyzer.
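
To make the trade-off concrete, here is a minimal sketch in plain Java
of two common ways to tokenize CJK text: one token per character (the
behavior described above for StandardAnalyzer) and overlapping
two-character tokens. It deliberately does not call any Lucene API.
The class and method names (CjkTokenSketch, unigrams, bigrams) are
hypothetical, and the assumption that CJKAnalyzer works along the
lines of the bigram strategy should be checked against the contrib
source before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: NOT Lucene code, just the two tokenization ideas.
public class CjkTokenSketch {

    // One token per Unicode code point ("characters individually").
    public static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            tokens.add(new String(Character.toChars(cp)));
            i += Character.charCount(cp);
        }
        return tokens;
    }

    // Overlapping pairs of adjacent characters. A lone character is
    // kept as a single token so that short input is still indexed.
    public static List<String> bigrams(String text) {
        List<String> chars = unigrams(text);
        List<String> tokens = new ArrayList<>();
        if (chars.size() == 1) {
            tokens.add(chars.get(0));
        }
        for (int i = 0; i + 1 < chars.size(); i++) {
            tokens.add(chars.get(i) + chars.get(i + 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "中文分词";
        System.out.println(unigrams(text)); // [中, 文, 分, 词]
        System.out.println(bigrams(text));  // [中文, 文分, 分词]
    }
}
```

Single-character tokens keep the index small and match any
character, but a query for a two-character word also matches
documents where those characters appear in unrelated places unless it
is run as a phrase query. Bigram tokens make such matches more
precise at the cost of a larger term dictionary. Which one is better
depends on the kind of searches you need to support.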

     Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

