httpd-dev mailing list archives

From Joe Orton <jor...@redhat.com>
Subject Re: mod_dav and EBCDIC
Date Wed, 23 Nov 2005 15:11:23 GMT
On Wed, Nov 23, 2005 at 09:37:04AM -0500, Jeff Trawick wrote:
> On 11/23/05, Joe Orton <jorton@redhat.com> wrote:
> > On Sun, Nov 20, 2005 at 09:53:50AM -0500, Jeff Trawick wrote:
> > > On input path, ap_xml_parse_input() handles converting xml to native
> > > charset (at least in 2.2).  On output, there is no provision for
> > > converting xml in responses.
> >
> > OK, pop quiz: how is a Unicode XML document getting converted into
> > EBCDIC on input without losing most of the character set along the way?
> 
> unclear to me, at least...
> 
> For this code:
> 
> server/util_xml.c::ap_xml_parse_input():
> ...
> #if APR_CHARSET_EBCDIC
>     apr_xml_parser_convert_doc(r->pool, *pdoc, ap_hdrs_from_ascii);
> #endif
> ...
> 
> The xml library apparently parses the input well enough to
> understand the nodes.  After that, it looks like the charset
> translation specified here (ap_hdrs_from_ascii) should use the real
> charset specified by the client.  As it is, interesting* characters
> won't be handled correctly.

It looks like ap_hdrs_from_ascii is always an ISO-8859-1->EBCDIC xlate 
handle, so this is really quite broken.  The document will always be in 
UTF-8 as returned by the XML parser.  At the very least I guess it could
try to do a UTF-8->EBCDIC conversion and fail if that's not possible.
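
To make the "convert and fail if that's not possible" idea concrete, here is
a rough standalone sketch, not httpd code.  The helper name
utf8_to_single_byte() is made up for illustration, and ISO-8859-1 stands in
for the EBCDIC code page (an assumption, to avoid carrying a full EBCDIC
table), so "representable" here simply means a code point of 0xFF or below.
In httpd itself this would presumably go through an apr_xlate handle opened
with UTF-8 as the source charset instead of ap_hdrs_from_ascii.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: strictly convert UTF-8 to a single-byte charset.
 * ISO-8859-1 stands in for the EBCDIC code page, so a character is
 * representable only if its code point is <= 0xFF.
 * Returns the number of bytes written, or -1 on malformed input, an
 * unrepresentable character, or insufficient output space. */
static long utf8_to_single_byte(const unsigned char *in, size_t in_len,
                                unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);

    while (i < in_len) {
        unsigned long cp;
        size_t need, k;
        unsigned char b = in[i];

        /* Classify the lead byte and count continuation bytes. */
        if (b < 0x80) {
            cp = b;
            need = 0;
        } else if ((b & 0xE0) == 0xC0) {
            cp = b & 0x1F;
            need = 1;
        } else if ((b & 0xF0) == 0xE0) {
            cp = b & 0x0F;
            need = 2;
        } else if ((b & 0xF8) == 0xF0) {
            cp = b & 0x07;
            need = 3;
        } else {
            return -1;          /* stray continuation or invalid lead byte */
        }

        if (in_len - i - 1 < need)
            return -1;          /* sequence truncated at end of input */

        for (k = 1; k <= need; k++) {
            if ((in[i + k] & 0xC0) != 0x80)
                return -1;      /* not a continuation byte */
            cp = (cp << 6) | (in[i + k] & 0x3F);
        }

        /* Reject overlong encodings. */
        if ((need == 1 && cp < 0x80) ||
            (need == 2 && cp < 0x800) ||
            (need == 3 && cp < 0x10000))
            return -1;

        if (cp > 0xFF)
            return -1;          /* no equivalent in the target charset */
        if (o >= out_cap)
            return -1;          /* output buffer too small */

        out[o++] = (unsigned char)cp;
        i += need + 1;
    }
    return (long)o;
}
```

With this, the two UTF-8 bytes 0xC3 0xA9 (e-acute) come out as the single
byte 0xE9, whereas a bytewise ISO-8859-1 translation would treat them as two
separate characters.  The three bytes 0xE2 0x82 0xAC (euro sign) make the
call return -1 rather than silently producing garbage.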

Sorting this stuff out is really a prerequisite to working out how to 
get the output side right, I'd say...

joe
