lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: [Lucene] XML Indexing
Date Fri, 30 Apr 2004 12:45:37 GMT
On Apr 29, 2004, at 11:29 PM, Samuel Tang wrote:
> I've fixed the problem by myself. Thank you.

What was the solution?  Choosing a different analyzer?

I've recently been doing some work on Chinese analysis:

	http://www.blogscene.org/erik/LuceneInAction/i18n.html

But not within the context of XML.  There are obviously many variables 
in the equation (XML file encoding, the analyzer, and more).

	Erik

>
> Samuel Tang <samuel202004@yahoo.com.hk> wrote:Any comments and 
> suggestions. Please help!
>
> Note: forwarded message attached.
>
> 必殺技、飲歌、小星星...
> 浪漫鈴聲 情心連繫
> http://ringtone.yahoo.com.hk/
>
>
>> ATTACHMENT part 2 message/rfc822
> 日期: Wed, 28 Apr 2004 23:39:30 +0800 (CST)
> 寄件人: Samuel Tang
> 標題: [Lucene] XML Indexing
> 收件人:: lucene-user@jakarta.apache.org
>
> XMLIndexingDemo seems not able to index traditional Chinese 
> characters. I can only search for English text and not Chinese. In 
> fact, my XML document contains both Chinese and English text. How can 
> I fix this problem? Is it necessary for me to convert the Chinese 
> characters in BIG5 to UTF-8 before doing the file indexing? If it is, 
> then how can we do it? This problem won't happen on indexing bilingual 
> HTML files (Chinese & English) with Lucene Demo HTML parser.
>
> 必殺技、飲歌、小星星...
> 浪漫鈴聲 情心連繫
> http://ringtone.yahoo.com.hk/
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
> 必殺技、飲歌、小星星...
> 浪漫鈴聲  情心連繫
> http://ringtone.yahoo.com.hk/


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message