axis-java-dev mailing list archives

From "Davanum Srinivas (JIRA)" <>
Subject [jira] Resolved: (AXIS-1676) SAXParseException for message containing more than 2 bytes UTF-8 chars and DIME Attachment
Date Tue, 23 Nov 2004 00:10:24 GMT
Davanum Srinivas resolved AXIS-1676:

    Resolution: Fixed

Applied patch.

> SAXParseException for message containing more than 2 bytes UTF-8 chars and DIME Attachment
> ------------------------------------------------------------------------------------------
>          Key: AXIS-1676
>          URL:
>      Project: Axis
>         Type: Bug
>   Components: Serialization/Deserialization
>     Versions: 1.1
>  Environment: Detected on Axis 1.1 under Windows 2000, but clearly affects newer Axis versions as well, under any operating system, JDK, etc.
> See the description and patch.
>     Reporter: Damien
>  Attachments: diff.txt
> Description from a user point of view
> ---------------------------------------
> In some cases, on the client side, you may get a SAXParseException for a message that contains multi-byte UTF-8 characters (2 bytes or more) and also carries a DIME attachment.
> This is systematic for a given message (but depends on the message content).
> When there is no DIME attachment, or when the message carries a MIME attachment instead, it works fine.
> Description from a patch submitter point of view
> -------------------------------------------------
> The org.apache.axis.attachments.DimeDelimitedInputStream read() method is buggy.
> It should always return an int in the range 0 to 255, or -1 when the end of stream is reached, but it may return other negative values.
> This is due to a "cast error" from byte to int.
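
For illustration, a minimal sketch of this kind of byte-to-int error. This is not the Axis source or the patch in diff.txt; the class and method names are hypothetical. In Java, byte is signed, so widening a byte in the range 0x80..0xFF to int yields a negative number unless it is masked with 0xFF.

```java
class ByteCastDemo {
    // Buggy variant: the implicit byte-to-int widening sign-extends,
    // so bytes 0x80..0xFF come back as negative ints.
    static int readBuggy(byte[] buf, int pos) {
        return buf[pos];
    }

    // Masked variant: the result always stays in the range 0..255.
    static int readFixed(byte[] buf, int pos) {
        return buf[pos] & 0xFF;
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is encoded in UTF-8 as 0xC3 0xA9.
        byte[] utf8 = { (byte) 0xC3, (byte) 0xA9 };
        System.out.println(readBuggy(utf8, 0)); // -61
        System.out.println(readFixed(utf8, 0)); // 195
    }
}
```

Pure ASCII content never triggers the problem, because every ASCII byte is below 0x80. That matches the report that the failure depends on the message content.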
> Full analysis
> --------------
> When Xerces parses a message, it first reads a buffer of a given size (2048 bytes) using the UTF8Reader class and the read(byte[], int, int) method of the input stream.
> This byte array is then decoded from UTF-8 into a char array. If the last byte of the buffer is the beginning of a multi-byte UTF-8 character, one or more additional bytes are requested to complete that character.
> This is done through the read() method (with no parameter). For a message with a DIME attachment, the input stream is a DimeDelimitedInputStream. Because its read() method may return a negative value, the UTF8Reader may conclude that the end of stream has been reached (which is not the case). As a consequence, the SOAPPart is not fully passed to the parser and the parsing fails.
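
The truncation described above can be reproduced with a small sketch. Everything here is hypothetical: ArrayStream stands in for a stream with the unmasked read(), and drain() stands in for a consumer that treats any negative return value as end of stream. Neither is the actual Axis or Xerces code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class TruncationDemo {
    /** Hypothetical stream; with masked == false it shows the sign-extension bug. */
    static final class ArrayStream extends InputStream {
        private final byte[] data;
        private final boolean masked;
        private int pos;

        ArrayStream(byte[] data, boolean masked) {
            this.data = data;
            this.masked = masked;
        }

        @Override
        public int read() {
            if (pos >= data.length) {
                return -1; // genuine end of stream
            }
            byte b = data[pos++];
            // Unmasked: bytes 0x80..0xFF are returned as negative ints.
            return masked ? (b & 0xFF) : b;
        }
    }

    /** Stand-in consumer: stops at the first negative value it reads. */
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) >= 0) {
            out.write(b);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // \u00e9 is a two-byte UTF-8 character (0xC3 0xA9).
        byte[] xml = "<name>R\u00e9mi</name>".getBytes(StandardCharsets.UTF_8);
        System.out.println(drain(new ArrayStream(xml, false))); // cut off after "<name>R"
        System.out.println(drain(new ArrayStream(xml, true)));  // full element
    }
}
```

With the unmasked stream the consumer sees only a fragment of the XML, which is the kind of input that makes a parser report a SAXParseException.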
> ---
> The patch is available and will be submitted.

This message is automatically generated by JIRA.
