lucene-java-user mailing list archives

From "Seeta Somagani" <Seeta.Somag...@xplana.com>
Subject RE: Accented characters problem
Date Thu, 02 Mar 2006 21:18:49 GMT
When I had this problem, I found out that the characters I was entering were UTF-8 encoded,
while Java was converting them using the cp1252 encoding. I took care of this by using
xml.getBytes("UTF-8") for writing and, similarly, new String(buffer, 0, bytes_read, "UTF-8")
for reading. This solved my problem.
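
A minimal sketch of the round trip described above, in case it helps. The class
and method names (Utf8RoundTrip, encode, decode) are made up for illustration,
and StandardCharsets.UTF_8 (Java 7 and later) is used in place of the "UTF-8"
string name only to avoid the checked UnsupportedEncodingException. The
ISO-8859-1 decode at the end is just a stand-in for "some other single-byte
charset"; it is not cp1252 itself.

```java
import java.nio.charset.StandardCharsets;

final class Utf8RoundTrip {
    // Writing side: always name the charset instead of relying on the platform default.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Reading side: decode only the bytes actually read, with the same charset.
    static String decode(byte[] buffer, int bytesRead) {
        return new String(buffer, 0, bytesRead, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e9 is 'e' with an acute accent; escapes keep this source file ASCII-only.
        String xml = "<name>agr\u00e9\u00e9</name>";
        byte[] bytes = encode(xml);

        // Same charset on both sides: the text survives unchanged.
        String same = decode(bytes, bytes.length);
        System.out.println(same.equals(xml));

        // Mismatched charset: each accented character turns into two wrong characters.
        String wrong = new String(bytes, 0, bytes.length, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.equals(xml));
    }
}
```

One caveat: if the data is read in chunks, a multi-byte character can be split
across two reads, so decoding each chunk separately may still corrupt it.
Wrapping the stream in an InputStreamReader created with an explicit charset
avoids that.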

seeta

-----Original Message-----
From: David denBoer [mailto:ddenboer@apple.com] 
Sent: Thursday, March 02, 2006 4:14 PM
To: java-user@lucene.apache.org
Subject: Accented characters problem

Hi all,

We are having a small problem searching for text with accents in the
query. Our index has a word like 'agréé', and when we search for it,
we get no results.

The query parses (using Snowball) to:
'name:"agr\213 \213"'

Using the ISOLatin filter, we get:
'name:agra'

Neither gets any results.
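
For comparison, here is a rough sketch of what accent folding produces when the
input string has been decoded correctly. This is not Lucene's ISOLatin filter;
it is a stand-in built on the JDK's java.text.Normalizer, and the AccentFolding
class and fold method are hypothetical names. With a correctly decoded 'agréé'
it yields 'agree', so a result like 'agra' suggests (but does not prove) that
the characters were altered before the filter ran.

```java
import java.text.Normalizer;

final class AccentFolding {
    // Decompose accented letters into base letter + combining mark (NFD),
    // then drop the combining marks.
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // \u00e9 is 'e' with an acute accent; escapes keep this source file ASCII-only.
        System.out.println(fold("agr\u00e9\u00e9"));
    }
}
```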

When I perform the search using Luke, I get the expected results.

Is there something I am not doing right? I swear this worked with  
Lucene 1.4.3 and is not working anymore, but it has been a while...

Thanks,
David.
  
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

