axis-java-dev mailing list archives

From "David Jencks (JIRA)" <axis-...@ws.apache.org>
Subject [jira] Updated: (AXIS-1971) problem with BOM and character set encoding
Date Mon, 02 May 2005 22:18:06 GMT
     [ http://issues.apache.org/jira/browse/AXIS-1971?page=all ]

David Jencks updated AXIS-1971:
-------------------------------

    Attachment: MessageContext.diff

Patch to set the character set encoding on the response as soon as possible.

> problem with BOM and character set encoding
> -------------------------------------------
>
>          Key: AXIS-1971
>          URL: http://issues.apache.org/jira/browse/AXIS-1971
>      Project: Axis
>         Type: Bug
>   Components: Serialization/Deserialization
>     Versions: current (nightly)
>     Reporter: David Jencks
>  Attachments: MessageContext.diff
>
> I'm encountering this problem in the Geronimo Axis integration, so it's possible that it is not an Axis bug, but I don't see how.
> I send a UTF-16 character set encoded message to the server, and get back a message that starts with a byte order mark but claims to be UTF-8.
> I've copied the code from AxisServlet that sets the character encoding on the response to the equivalent place in the Geronimo code.
> Tracing through what is happening, I find that on the return from the invoke call, on leaving the HandlerChainImpl (postInvoke, line 206), the entire response is serialized into a ByteArray using the default UTF-8 character set encoding.
> After invoke returns, the code from AxisServlet changes the character set encoding to UTF-16 and writes out the message. However, since the message was already serialized into a ByteArray buffer, this apparently has the effect of writing out the byte order mark followed by the byte array that was produced using UTF-8.
> This can be fixed by making the message context set the response message's character set encoding at the moment the response message is set on the message context (see attached patch).
> I find the logic that determines the response character set encoding byzantine and would prefer to simplify it until I can understand how it works. To proceed, I need answers to these questions:
> 1. Under what circumstances would a response message be in a different character set encoding than the request?
> 2. Which user/application code should be able to set the response character set encoding, and how should it do so?
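
The mismatch described above can be reproduced outside Axis. The following is a minimal sketch under stated assumptions, not the Axis or Geronimo code and not the attached patch: the class BomMismatchSketch and its methods are hypothetical names invented for illustration. It models a body serialized early as UTF-8 that is later sent behind a UTF-16 byte order mark, and contrasts it with a body whose encoding was chosen before serialization.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names come from Axis.
public class BomMismatchSketch {

    // Models the reported behavior: the body is serialized with the default
    // UTF-8 encoding first, and a UTF-16 byte order mark is written in front
    // of those already-encoded bytes afterwards.
    static byte[] lateEncodingResponse(String xml) {
        byte[] cached = xml.getBytes(StandardCharsets.UTF_8); // serialized too early
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0xFE); // big-endian UTF-16 byte order mark, first byte
        out.write(0xFF); // big-endian UTF-16 byte order mark, second byte
        out.write(cached, 0, cached.length); // stale UTF-8 bytes follow the BOM
        return out.toByteArray();
    }

    // Models the proposed direction: the encoding is known before the body is
    // serialized. Java's "UTF-16" charset encodes big-endian and writes the BOM.
    static byte[] earlyEncodingResponse(String xml) {
        return xml.getBytes(StandardCharsets.UTF_16);
    }

    // True if a receiver that trusts the BOM recovers the original text.
    static boolean roundTrips(byte[] wire, String expected) {
        return new String(wire, StandardCharsets.UTF_16).equals(expected);
    }

    public static void main(String[] args) {
        String xml = "<soapenv:Envelope/>"; // placeholder payload
        System.out.println("late encoding round-trips:  "
                + roundTrips(lateEncodingResponse(xml), xml));
        System.out.println("early encoding round-trips: "
                + roundTrips(earlyEncodingResponse(xml), xml));
    }
}
```

In this sketch the late-encoded response begins with the BOM bytes FE FF and continues with single-byte UTF-8 characters, so a UTF-16 reader cannot recover the text, while the early-encoded response decodes cleanly.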

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira