lucene-java-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: encoding problem when retrieving document field value
Date Mon, 03 Mar 2014 17:44:08 GMT
What is the hex value for that second character returned that appears to 
display as an apostrophe? Hex 92 (decimal 146) is listed as "Private Use 
2", so who knows what it might display as. All that is important is the 
binary/hex value.
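
One way to check the hex value, as a minimal sketch: dump each code point of the 
returned string. The class name CodePointDump and the sample string are 
made up for illustration; in practice the input would be whatever your 
application gets back from the stored field.

```java
public class CodePointDump {

    // Formats every code point of the string as U+XXXX, separated by spaces.
    static String dump(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            // Advance by one or two chars, depending on surrogate pairs.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: a stand-in for the value read back from the index.
        String value = "l\u0092a";
        System.out.println(dump(value));
    }
}
```

If the second code point comes out as U+0092, the character is there and 
only its display is at issue.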

Out of curiosity, how did your application come about picking a PU Unicode 
character?

-- Jack Krupansky

-----Original Message----- 
From: G.Long
Sent: Monday, March 3, 2014 12:09 PM
To: java-user@lucene.apache.org
Subject: encoding problem when retrieving document field value

Hi :)

My index (Lucene 3.5) contains a field called title. Its value is
indexed (analyzed and stored) with the WhitespaceAnalyzer and can
contain HTML entities such as &#146; or &#176;.

My problem is that when I retrieve values from this field, some of the
HTML entities are missing.
For example:

Luke tells me that the stored value is: "l&#146;application n&#176;
90-1258" and when I retrieve the field value in my application, I get
"l’application n° 90-1258".

The apostrophe is not in the returned value whereas the ° character is
present.
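
One possible reading of this behaviour, offered as an assumption rather than 
a diagnosis: if something between the index and the application turns each 
numeric entity into the code point with that number, &#176; becomes U+00B0 
(the degree sign) while &#146; becomes U+0092, a C1 control character that 
most displays do not render. The sketch below only illustrates that 
literal decoding; the class EntityDecodeSketch and its method are 
hypothetical and are not part of Lucene.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EntityDecodeSketch {

    // Matches decimal numeric entities such as &#146; or &#176;
    private static final Pattern NUMERIC_ENTITY = Pattern.compile("&#(\\d+);");

    // Replaces each decimal entity with the code point of the same number.
    static String decodeLiterally(String s) {
        Matcher m = NUMERIC_ENTITY.matcher(s);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(s, last, m.start());
            out.appendCodePoint(Integer.parseInt(m.group(1)));
            last = m.end();
        }
        out.append(s, last, s.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decodeLiterally("l&#146;application n&#176; 90-1258");
        // Print each char as hex so an invisible control character still shows up.
        for (char c : decoded.toCharArray()) {
            System.out.printf("%04X ", (int) c);
        }
        System.out.println();
    }
}
```

In the hex output the second value is 0092, so under this assumption the 
character is present in the string even though nothing visible is drawn for it.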

What could be the problem?

Thanks,

Gary



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org 



