cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: Output encoding fixes
Date Tue, 04 Apr 2000 18:55:03 GMT
Richard Hieber wrote:
> 
> Hi,
> 
> I just downloaded Cocoon 1.7.3-dev from the CVS, compiled and installed.
> Stefano has fixed two bugs today related to output encoding. For me
> something still doesn't work out right. But I can't even claim to understand
> the problem fully so please bear with me.
> 
> Let me give an example:
> 
> umlaut-page.xml
> ------------------------------------------------------------------
> <?xml version="1.0" encoding="iso-8859-1"?>
> <?xml-stylesheet href="umlaut-page-result.xsl" type="text/xsl"?>
> <?cocoon-process type="xslt"?>
> 
> <page>
>  <content>This page contains German Umlaut chars: äöüÄÖÜ</content>
> </page>
> ------------------------------------------------------------------
> 
> umlaut-page-result.xsl
> ------------------------------------------------------------------
> <?xml version="1.0"?>
> 
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> 
>   <xsl:template match="page">
>    <xsl:processing-instruction
> name="cocoon-format">type="text/xml"</xsl:processing-instruction>
>    <result>
>      <xsl:value-of select="content"/>
>    </result>
>   </xsl:template>
> 
> </xsl:stylesheet>
> ------------------------------------------------------------------
> 
> Cocoon processes the XSLT-Transformation and gives back the following
> result:
> ------------------------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <result>This page contains German Umlaut chars: äöüÄÖÜ</result>
> 
> <!-- This page was served in 0 milliseconds by Cocoon 1.7.3-dev -->
> ------------------------------------------------------------------
> 
> What bugs me is that the encoding of the result is "UTF-8", not "iso-8859-1"
> like it should (IMO).
> The consequence is that Internet Explorer chokes on the Umlaut characters
> and won't display the XML tree but shows an error message instead.
> 
> Nothing changes even if I put the encoding attribute in the XML declaration
> of the stylesheet or inside the "cocoon-format" processing instruction.
> 
> BTW, is it "iso-8859-1" or "ISO-8859-1"? Or maybe both is correct?

Hmmm, there is probably something that you are missing, since this may
not be a bug.

Today I fixed two bugs with encoding, one for input and one for output,
but they are not related. In fact, nothing guarantees that the encoding
you use for input is also the one used for output. It's not obvious and
it should not be.

By writing 

  <?cocoon-format type="text/xml"?>

you are basically saying: 
 
  "format this document using the formatter associated with the type
"text/xml""

nothing more. To make the "text/xml" formatter output something using
the "iso-8859-1" encoding, you must say so. How? Add this line to your
cocoon.properties

  formatter.text/xml.encoding = iso-8859-1

which tells your formatter to use that encoding for output,
independently of what encoding was used to serialize the XML document as
input.
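
To illustrate why the two encodings are independent, here is a minimal
Java sketch. It is not Cocoon code: the class EncodingDemo and the method
transcode are hypothetical names made up for this example. Once the
input bytes are decoded into characters, the input encoding is gone, and
the output encoding is a separate choice made when writing.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not part of Cocoon.
public class EncodingDemo {

    // Decode the bytes with the input encoding, then re-encode the
    // resulting characters with whatever output encoding is requested.
    public static byte[] transcode(byte[] input, Charset in, Charset out) {
        String characters = new String(input, in);
        return characters.getBytes(out);
    }

    public static void main(String[] args) {
        // The umlauts from the example page, written as Unicode escapes
        // so the source file's own encoding does not matter.
        String text = "\u00e4\u00f6\u00fc\u00c4\u00d6\u00dc";

        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = transcode(latin1,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        // Same characters, different bytes: one byte per umlaut in
        // ISO-8859-1, two bytes per umlaut in UTF-8.
        System.out.println("iso-8859-1 bytes: " + latin1.length);
        System.out.println("UTF-8 bytes:      " + utf8.length);
    }
}
```

The formatter setting above plays the role of the "out" argument here:
it picks the output encoding, whatever the input document declared.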

Hope this helps.

BTW, if the above doesn't work, then, yes, we have a bug :)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------

