lucene-java-user mailing list archives

From asitag <>
Subject Chinese Japanese Korean Indexing issue Version 2.4
Date Thu, 10 Sep 2009 17:52:27 GMT


We are trying to index HTML files that contain Japanese, Korean, and Chinese
content using the CJK analyzer. While indexing, we get a lexical parse error
saying that an unknown character was encountered. We tried setting the string
encoding to UTF-8, but it does not help.

Can anyone please help? Any pointers would be highly appreciated.
