axis-java-dev mailing list archives

From "Carsten Burghardt" <cars...@cburghardt.com>
Subject Re: Encoding problem
Date Mon, 11 Aug 2008 14:40:40 GMT
Quoting "WJ Krpelan" <krpelan_wj@yahoo.com>:

> Hi,
> hope I got this right.
> The encoding with &#<hex>; looks perfect to me.
> You should check whether the actual hex values correspond to the
> Unicode code points of your Russian characters.

Hmm, how do I do this?
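
One possible way to check this, sketched in Java: print the code point of every character in the string before it is handed to Axis and see whether the values fall into the Cyrillic block (U+0400 to U+04FF). The class and method names below are made up for illustration and are not part of Axis; the sample string is only an assumed example subject.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Axis: inspects the Unicode code points of a string.
class CodePointCheck {

    // Returns each code point of the text formatted as U+XXXX.
    static List<String> codePoints(String text) {
        List<String> out = new ArrayList<String>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            out.add(String.format("U+%04X", cp));
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return out;
    }

    // Counts the code points that belong to the basic Cyrillic block (U+0400..U+04FF).
    static int countCyrillic(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.CYRILLIC) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        // Russian "Privet", written with escapes so the source file encoding does not matter.
        String subject = "\u041F\u0440\u0438\u0432\u0435\u0442";
        System.out.println(codePoints(subject));
        System.out.println(countCyrillic(subject) + " of "
                + subject.codePointCount(0, subject.length()) + " code points are Cyrillic");
    }
}
```

If the values printed for a Russian subject line are outside the Cyrillic range (for example control characters such as U+001E), the string was probably already decoded with the wrong charset before it reached Axis.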

> If this is the case, how did you verify that the characters were broken
> inside the DOM tree? Is your tool capable of showing Russian
> characters?

Yes, I debugged it in Eclipse, so I could see that the
characters were not displayed correctly.

> Broken would mean that the numeric values in your UTF-8 XML do not
> correspond to the UTF-8 byte values of your Russian characters, which are
> quite different from the Unicode code points.
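
To illustrate the difference between the two sets of values, here is a minimal Java sketch that prints, for the same character, the UTF-8 bytes it has on the wire and the hexadecimal character reference (`&#x...;`) built from its code point. The class and method names are hypothetical and only serve the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: contrasts UTF-8 byte values with Unicode code points.
class Utf8VsCodePoint {

    // Hex dump of the UTF-8 encoding of the text, e.g. "D0 9F".
    static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // XML hexadecimal character references built from the code points, e.g. "&#x41F;".
    static String charRefs(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            sb.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String pe = "\u041F"; // CYRILLIC CAPITAL LETTER PE
        System.out.println("UTF-8 bytes:         " + utf8Hex(pe));  // D0 9F
        System.out.println("character reference: " + charRefs(pe)); // &#x41F;
    }
}
```

A character reference carries the code point (41F), while a UTF-8 encoded document that contains the character directly carries the two bytes D0 9F; both describe the same letter.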
>
> HTH,
> Wolfgang
>
> --- On Fri, 8/8/08, Carsten Burghardt <carsten@cburghardt.com> wrote:
>
>> From: Carsten Burghardt <carsten@cburghardt.com>
>> Subject: Encoding problem
>> To: axis-dev@ws.apache.org
>> Date: Friday, August 8, 2008, 1:51 PM
>> Hi,
>>
>> first of all, I know that this is more a question for the user list, but
>> nobody could help me there, so apologies, but I'll try here as I don't
>> know how to continue. I have a web service (Axis 1.4) that connects to an
>> Alfresco server and stores metadata from emails (like subject, sender,
>> ...). This works fine with ISO-* or UTF-8 encoded emails. But once I
>> have an email with a more "exotic" character set like KOI8-R (Russian),
>> I get an error on the server side because of invalid characters (like
>> 0x1e). I know that there are no control characters in the content, so I
>> watched the traffic with tcpmon and noticed that all characters were
>> totally screwed up.
>> So I traced the Axis code and saw that the characters were encoded
>> with &#<hex>; in the SoapBody. Afterwards the DOM tree is serialized
>> in the DoAllSender class, and then the characters are broken in the
>> generated XML. When I switched the encoding of the SOAP message to
>> KOI8-R instead of UTF-8, the characters showed up fine in tcpmon,
>> but then the server reported an error about a different illegal
>> character (0x1), which is probably because the message is converted to
>> UTF-8 at a certain step.
>> So I guess my question is: what is the proposed way to transmit those
>> characters to a web service (apart from Base64 encoding them...)?
>>
>> Many thanks
>>
>> Carsten
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: axis-dev-unsubscribe@ws.apache.org
>> For additional commands, e-mail:
>> axis-dev-help@ws.apache.org




