cocoon-users mailing list archives

From Mark Lundquist
Subject Character encoding problem with latest Saxon + Cocoon 2.1
Date Wed, 31 Dec 2008 01:11:51 GMT

Hi All,

I have a Cocoon 2.1.8 application using Saxon, and after a very long  
time with no problems, we just stumbled upon a bug in the (now rather  
old) version of Saxon that we were using.  So I downloaded the latest  
Saxon distribution and swapped out the Saxon JAR, and... problem  
solved!  Except that now, I seem to have a new problem with character  
encoding.

Cocoon is serving a web page with a bunch of occurrences of the  
"ndash" character (U+2013, decimal 8211).  These displayed correctly with the  
old Saxon, but now with the new version they instead look like this:


:-(.  The HTMLSerializer is adding the correct

	<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

The declaration for that component did not include any <encoding>  
element.  I tried adding one like this:


which had no effect.  But then just for giggles I tried


...and discovered that this "fixed" the bad characters.  (It also  
changes the <meta http-equiv="Content-Type"> element generated by the  
serializer).  Which is interesting, but not really what I want; I want  
to be all UTF-8.  So I reverted to 'utf-8' in the <encoding> of  
the HTMLSerializer configuration and kept fiddling around.  I tried adding

	<xsl:output method="xml" encoding="utf-8"/>

to my stylesheets, but that had no effect.  In addition, I then also  
tried adding 'encoding="UTF-8"' to the <?xml?> preamble of my source  
document, and that also had no effect.
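
For anyone who wants to reproduce the symptom outside Cocoon, here is a minimal sketch in Python (used only as a scratchpad; none of this is Cocoon or Saxon API, and the helper names are made up for illustration). It assumes the garbling comes from the page's UTF-8 bytes being decoded as a single-byte Western encoding somewhere along the way, and that the setting which "fixed" things was a Latin-1 style encoding. Neither assumption is confirmed above.

```python
# Scratchpad for the en dash problem; helper names are hypothetical.

EN_DASH = "\u2013"  # the "ndash" character, decimal 8211


def misdecode(text, actual="utf-8", assumed="cp1252"):
    """Encode text as `actual`, then decode those bytes as `assumed`."""
    return text.encode(actual).decode(assumed)


def encodable(text, encoding):
    """Return True if every character in text exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def code_points(text):
    """Render text as ASCII-safe code point labels, e.g. 'U+2013'."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# The dash is three bytes in UTF-8.
utf8_bytes = EN_DASH.encode("utf-8")
print(utf8_bytes.hex())  # e28093

# Assumption: a consumer reads those bytes as windows-1252, which is how
# browsers treat pages labelled ISO-8859-1. One dash becomes three characters.
garbled = misdecode(EN_DASH)
print(len(garbled), code_points(garbled))  # 3 U+00E2 U+20AC U+201C

# The dash has no byte of its own in ISO-8859-1 ...
print(encodable(EN_DASH, "iso-8859-1"))  # False

# ... so an encoder has to escape it; the result is plain ASCII.
escaped = EN_DASH.encode("iso-8859-1", errors="xmlcharrefreplace")
print(escaped)  # b'&#8211;'
```

If the three code points printed for the garbled dash match what the browser shows, the bytes on the wire are probably valid UTF-8 and something is labelling them as a different encoding; the HTTP Content-Type header is worth checking alongside the <meta> element. The last line hints at why a Latin-1 output encoding could mask the problem: the dash cannot be written as a raw byte there, so a serializer has to fall back to a character reference, which is pure ASCII and survives any mislabelling.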

Anybody have any clues to share?

