cocoon-dev mailing list archives

From Marc Portier <...@outerthought.org>
Subject Re: encoding problem
Date Thu, 06 Nov 2003 21:09:53 GMT


Leszek Gawron wrote:
> On Wed, Nov 05, 2003 at 04:33:51PM +0100, Leszek Gawron wrote:
> 
>>Current CVS version of cocoon:
>>
>>ServerPagesGenerator.generate(): java.lang.RuntimeException:
>>org.xml.sax.SAXException: Attempt to output character of integral value 346
>>that is not represented in specified output encoding of .
>>
>>I am using UTF-8 everywhere and switching to current cocoon cvs version either
>>displays the above error or just messes up all polish characters in my html
>>pages.
> 

hm, I did some minor tests with unusual chars in XML serialized into 
ISO-8859-1 encoded files, and they were all nicely converted into &#....; 
character-entities (which admittedly don't look that nice, but at 
least 'work')
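
To illustrate the fallback described above, here is a minimal sketch (not Cocoon's serializer code; the class and method names are made up for this example) that uses the standard java.nio.charset API to check whether a character fits the target encoding and writes a numeric character-entity when it does not. Character 346 from the error message is 'Ś' (U+015A), which ISO-8859-1 cannot represent:

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper for illustration only; not part of Cocoon.
public class EncodingCheck {

    // Replaces every code point the charset cannot encode with &#NNN;
    static String escapeUnencodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            String s = new String(Character.toChars(cp));
            if (encoder.canEncode(s)) {
                out.append(s);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "\u015Awiat"; // starts with 'Ś', integral value 346
        System.out.println(escapeUnencodable(text, "ISO-8859-1")); // &#346;wiat
        System.out.println(escapeUnencodable(text, "UTF-8").equals(text)); // true
    }
}
```

With UTF-8 as the output encoding nothing needs escaping, so the text passes through unchanged.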

I did not use any Polish characters though

the only problem I would expect is when your Polish characters need to 
show up in the XML names of elements and attributes: there you can't have 
character-entities and thus the file-encoding must just be right...


> I am sorry. This has got nothing to do with cocoon. It's tomcat's 4.1.29
> fault. It sets Content-Type header to ISO-8859-1 .. strange
> 	lg

Are you sure?
I'm afraid this _has_ something to do with cocoon...


pls check
- discussion:
http://marc.theaimsgroup.com/?t=106760662600010&r=1&w=2

- recent commit:
http://marc.theaimsgroup.com/?l=xml-cocoon-cvs&m=106789462214616&w=2

and give your opinion...


(I am a bit confused by the ServerPagesGenerator part in this, but I 
guess it's just about having only a piece of the stacktrace?)

but anyway:

right now you can safely get back to utf-8 on the serialized output 
(and consistently also change the encoding used for 
request-parameter-encoding) by changing the 'form-encoding' init-param 
of the cocoon-servlet in the web.xml of cocoon.
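
As a rough sketch of that change (assuming the init-param sits inside the existing <servlet> declaration of the cocoon-servlet in web.xml; the surrounding elements are omitted here and the value shown is the one you would switch to):

```xml
<!-- inside the <servlet> element of the cocoon-servlet in web.xml -->
<init-param>
  <param-name>form-encoding</param-name>
  <!-- assumed target value for this thread; adjust to your setup -->
  <param-value>UTF-8</param-value>
</init-param>
```

The servlet needs to be reloaded for a changed init-param to take effect.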

alternatively you can re-create the former inconsistent behaviour by only 
setting the utf-8 encoding on the html serializer.

(any of the above quick tests will probably tell us fast if this is 
indeed cocoon related or not)

HTH
-marc=
-- 
Marc Portier                            http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at              http://radio.weblogs.com/0116284/
mpo@outerthought.org                              mpo@apache.org

