cocoon-dev mailing list archives

From Bruno Dumon <br...@outerthought.org>
Subject RE: patch to Enhance CIncludeTransformer to handle encoding of parameters
Date Tue, 02 Mar 2004 16:45:46 GMT
On Tue, 2004-03-02 at 14:57, Marco Dubbeld wrote:
> On Tue, 2004-03-02 at 13:29, Carsten Ziegeler wrote:
> > Marco Dubbeld wrote:
> > > 
> > > Yep! The value element may contain UTF-8, with Chinese 
> > > characters or other non-ISO-8859 characters. While 
> > > testing, the 
> > > 
> > > this.startSerializedXMLRecording(XMLUtils.defaultSerializeToXMLFormat(true));
> > > will use ISO-8859 encoding (see the properties returned by 
> > > XMLUtils). However, we should use a property set with the 
> > > encoding of the document we are transforming. Otherwise 
> > > this causes a UTFDataFormatException, for example with Chinese UTF-8.
> > 
> > So the best way would be to pass the encoding of the document
> > to the XMLUtils used for serializing. Is this possible?
> Properties props = XMLUtils.defaultSerializeToXMLFormat(true);
> String encoding = ??????????
> props.setProperty(OutputKeys.ENCODING, encoding);
> this.startSerializedXMLRecording(props);
> 
> I do not know precisely how best to determine the encoding of the
> events or the input source.
> java.sun.com is down from my location, so one half of my brain is
> blocked. If it comes back I will try to search.
> 
> Maybe someone on the list knows ?

You just need to choose something. I don't think SAX provides
information about the encoding of the original document. Always using
UTF-8 should be a safe choice.
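
To illustrate, here is a minimal sketch using only plain JAXP. It does not
touch Cocoon's startSerializedXMLRecording; the class name
Utf8SerializeSketch and the helper methods utf8Properties and serialize are
made up for this example, and a JAXP identity Transformer stands in for the
recording serializer. It assumes a modern JDK. The point is just that fixing
OutputKeys.ENCODING to UTF-8 lets Chinese characters pass through
serialization intact, whatever the encoding of the source document was.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Utf8SerializeSketch {

    // Hypothetical helper: output properties that always request UTF-8,
    // rather than trying to recover the source document's encoding.
    public static Properties utf8Properties() {
        Properties props = new Properties();
        props.setProperty(OutputKeys.METHOD, "xml");
        props.setProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        props.setProperty(OutputKeys.ENCODING, "UTF-8");
        return props;
    }

    // Hypothetical helper: run the XML through an identity transform and
    // serialize it to bytes using the given output properties.
    public static byte[] serialize(String xml, Properties props) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperties(props);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        identity.transform(new StreamSource(new StringReader(xml)),
                           new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Two Chinese characters, written as escapes so the source file's
        // own encoding does not matter.
        String xml = "<value>\u4e2d\u6587</value>";
        byte[] bytes = serialize(xml, utf8Properties());
        // Decode with the same encoding that was requested for output.
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Whoever reads the recorded bytes back has to decode them as UTF-8 as well;
the encoding only needs to be consistent between writer and reader, not
equal to that of the original input.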

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
bruno@outerthought.org                          bruno@apache.org

