lucene-java-user mailing list archives

From "Seeta Somagani" <>
Subject RE: Accented characters problem
Date Thu, 02 Mar 2006 21:18:49 GMT
When I had this problem, I found that the characters I was entering were in UTF-8, while
Java was converting them to cp1252 (presumably the platform default encoding). I took care
of this by using xml.getBytes("UTF-8") for writing and, similarly,
new String(buffer,0,bytes_read,"UTF8") for reading. This solved my problem.
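
For anyone who wants to see the effect in isolation, here is a minimal sketch of that round trip. The class and method names are made up for illustration, and it uses StandardCharsets (Java 7 and later) in place of the "UTF-8" string names above, which should behave the same way. ISO-8859-1 stands in for cp1252 in the "wrong" decode because it is guaranteed to be available on every JVM; for these particular bytes the two charsets agree.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene: shows explicit UTF-8 encoding and decoding.
class Utf8RoundTrip {

    // Encode with an explicit charset instead of relying on the platform default.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode the first bytesRead bytes of buffer as UTF-8.
    static String decode(byte[] buffer, int bytesRead) {
        return new String(buffer, 0, bytesRead, StandardCharsets.UTF_8);
    }

    // Decode the same bytes with a single-byte charset, as a mismatched default would.
    static String decodeAsLatin1(byte[] buffer, int bytesRead) {
        return new String(buffer, 0, bytesRead, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String word = "agr\u00e9\u00e9"; // the word from the original question
        byte[] bytes = encode(word);

        // Matching charsets on both sides give back the original string.
        System.out.println(decode(bytes, bytes.length).equals(word));

        // A mismatched charset turns each accented character into two characters,
        // so the decoded text no longer equals the indexed term.
        System.out.println(decodeAsLatin1(bytes, bytes.length).equals(word));
    }
}
```

If the text is decoded with the wrong charset before it reaches the analyzer or the query parser, the terms no longer match, which would be consistent with a search that returns no results.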


-----Original Message-----
From: David denBoer [] 
Sent: Thursday, March 02, 2006 4:14 PM
Subject: Accented characters problem

Hi all,

We are having a small problem searching for text with accents in the
query. Our index has a word like 'agréé', and when we search for it,  
we get no results.

The query parses (using Snowball) to :
'name:"agr\213 \213"'

Using the ISOLatin filter, we get :

Neither gets any results.

When I perform the search using Luke, I get the expected results.

Is there something I am not doing right? I swear this worked with  
Lucene 1.4.3 and is not working anymore, but it has been a while...

To unsubscribe, e-mail:
For additional commands, e-mail:
