lucene-java-user mailing list archives

From lwl <lwl.ro...@gmail.com>
Subject Re: Lucene and Chinese language
Date Thu, 01 Jul 2010 09:30:32 GMT
Yes, the StandardAnalyzer interprets each Chinese character as one word.
Better analyzers for Chinese are listed here:
http://hi.baidu.com/lewutian/blog/item/ca61060a06914b1394ca6b25.html
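To illustrate why the unspaced wildcard query finds nothing, here is a minimal sketch in plain Java. It does not call Lucene; the class and method names (CjkTokenSketch, unigrams, bigrams) are made up for this example. It only mimics two tokenization strategies: one token per character, which is roughly what the StandardAnalyzer does with Chinese text, and overlapping two-character tokens, which is the approach taken by Lucene's CJKAnalyzer. Wildcard terms are not run through the analyzer, so "*佛山东方书城*" is compared as a whole against single-character terms in the index and matches none of them.

```java
import java.util.ArrayList;
import java.util.List;

public class CjkTokenSketch {

    // One token per code point: roughly how StandardAnalyzer splits Chinese text.
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints()
            .forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // Overlapping two-character tokens: the bigram idea behind CJKAnalyzer.
    // (Illustration only; the real analyzer also handles mixed scripts.)
    static List<String> bigrams(String text) {
        List<String> chars = unigrams(text);
        List<String> tokens = new ArrayList<>();
        if (chars.size() == 1) {
            tokens.add(chars.get(0)); // a lone character stays a single token
        }
        for (int i = 0; i + 1 < chars.size(); i++) {
            tokens.add(chars.get(i) + chars.get(i + 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String phrase = "佛山东方书城";
        System.out.println(unigrams(phrase)); // [佛, 山, 东, 方, 书, 城]
        System.out.println(bigrams(phrase));  // [佛山, 山东, 东方, 方书, 书城]
    }
}
```

With either strategy, no single indexed term contains the whole phrase, so a quoted phrase query such as "佛山东方书城", which is passed through the analyzer, is likely a better fit than a wildcard query.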

On 1 July 2010 at 5:19 PM, Kolhoff, Jacqueline - ENCOWAY <Kolhoff@encoway.de> wrote:

>
> Hi!
>
> We are using lucene in our project to search through information objects
> which works fine. For indexing we use the StandardAnalyzer.
> Now, we have to support the Chinese language. I found out that the Chinese
> words and letters are correctly saved in the index but the query to search
> for them does not work. Example: in English language the query is “text”
> which we parse to “*text*”. If we search for Chinese words / phrases like
> “佛山东方书城”, the query is “*佛山东方书城*” but there are no search
> results. If the query places blanks between the single letters / symbols
> like this “*佛 山 东 方 书 城*”, we are getting results. Does the
> StandardAnalyzer interpret each
> Chinese letter as one word? What are best practices for this case? Shall we
> use another analyzer (Chinese analyzer)? Or is it better to replace the
> query parser in this case?
>
> Regards,
> Jacqueline.
>
