axis-c-user mailing list archives

From "Stadelmann Josef" <josef.stadelm...@axa-winterthur.ch>
Subject AW: encoding problem U+00C0 to U+00E7 are not parsed by libxml2_reader_wrapper.c;36(886)
Date Thu, 10 Jun 2010 14:17:24 GMT
Dear Nandika

 

All is OK :-) ....

My problem is gone; 

 

1.	I took Ethereal to trace the network traffic and found that the client was the one sending
rubbish rather than UTF-8
2.	I corrected the client: I removed my own encoding routine and trusted WCF 3.5 to encode
UTF-16 to UTF-8, because I asked for UTF-8 encoding in my custom binding
3.	I TRUSTED my custom binding asking for UTF-8 encoding
4.	Then I checked again with Ethereal and found that the client sends everything correctly

	a.	U+0020 to U+007F is sent OK
	b.	U+00A0 to U+00BF is sent OK
	c.	U+00C0 to U+00E7 is sent OK
	d.	U+00E8 to U+00FF is sent OK too

5.	Then I had a look at my Axis2 Java web service and found that it had already converted
the UTF-8 XML stream I sent into an ISO-8859-1 XML stream
6.	After this received stream was passed to my legacy C code (via JNI), it arrived as an ISO-8859-1
XML stream
7.	Then I added  <?xml version='1.0' encoding='ISO-8859-1'?> to the front of this received
payload to indicate the encoding to libxml2_reader_wrapper.c
8.	libxml2_reader_wrapper.c no longer complained, and everything is received correctly as ISO-8859-1
by my legacy PASCAL code on OpenVMS
9.	On the way back:
10.	during serialization the hash table data is moved into an OM model and encoded at that
time to UTF-8 (this will be taken out shortly)
11.	the OM model is serialized using axiom and axutil from Axis2/C (all goes well)
12.	the resulting XML stream, my payout payload, is passed back through JNI and arrives as a UTF-8
encoded XML stream in my Axis2 Java service
13.	there it is converted using Java AXIOM into an OM model and returned to the caller, the
Axis2 engine
14.	it is returned as UTF-8 encoded data to my WCF 3.5 client (because the client's custom binding
asked for UTF-8 encoding)
15.	at the WCF client it is automatically decoded into UTF-16, which is the internal string
encoding of the .NET Framework
16.	Round trip well done!
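
The inbound and return conversions in steps 5 to 12 can be sketched in a few lines. This is only an illustration in Python, not the Axis2/C, Java or JNI code from this thread: the payload is reduced to a single Fldval element, the helper names are hypothetical, and Python's built-in codecs stand in for the conversions done by Axis2 and by libxml2's isolat1ToUTF8().

```python
import xml.etree.ElementTree as ET

# Declaration prepended in step 7 (taken from the message above).
ISO_DECL = b"<?xml version='1.0' encoding='ISO-8859-1'?>"


def utf8_to_latin1(payload: bytes) -> bytes:
    """Step 5: re-encode a UTF-8 stream as ISO-8859-1 (raises if unmappable)."""
    return payload.decode("utf-8").encode("iso-8859-1")


def parse_latin1(payload: bytes) -> ET.Element:
    """Step 7: declare the real encoding so the parser does not assume UTF-8."""
    return ET.fromstring(ISO_DECL + payload)


def latin1_to_utf8(payload: bytes) -> bytes:
    """Step 10: the return path, ISO-8859-1 back to UTF-8."""
    return payload.decode("iso-8859-1").encode("utf-8")


# Hypothetical one-element payload; U+00C0 is C3 80 in UTF-8.
wire = "<Fldval>192 \u00c0</Fldval>".encode("utf-8")
legacy = utf8_to_latin1(wire)      # now one byte (C0) for the character
root = parse_latin1(legacy)        # parses because the encoding is declared
back = latin1_to_utf8(legacy)      # identical to what the client sent

print(legacy)
print(root.text)
print(back == wire)
```

Without the prepended declaration an XML parser assumes UTF-8, and the lone 0xC0 byte in the ISO-8859-1 stream is rejected.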

Lesson learned: Use Ethereal or Wireshark or any other network analyzer to see the exact encoding
of your characters in bytes on the wire. 
It helps you to identify whether the fault is in the client, the server, or both :-)
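
As a reference for reading such a trace, the expected UTF-8 bytes for the tested code points can be computed beforehand and compared with what the analyzer shows. A small sketch in Python (the helper name utf8_hex is made up for this illustration):

```python
def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 encoding of one code point as spaced hex bytes."""
    return " ".join(f"{b:02X}" for b in chr(code_point).encode("utf-8"))


# Boundaries of the ranges tested in this thread.
for cp in (0x20, 0x7F, 0xA0, 0xBF, 0xC0, 0xE7, 0xE8, 0xFF):
    print(f"U+{cp:04X} -> {utf8_hex(cp)}")
```

Code points below U+0080 are a single byte; U+0080 to U+00BF start with 0xC2 and U+00C0 to U+00FF start with 0xC3. In ISO-8859-1 each of these characters is a single byte equal to its code point, so the two encodings are easy to tell apart in a hex view.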

 

So, Nandika, it is my pleasure to thank you again for the help you offered. And LIBXML2
seems to be well done, even though the code I ported to VMS is now 4 years old.

 

Josef Stadelmann
@axa-winterthur.ch

 

 

From: Nandika Jayawardana [mailto:jayawark@gmail.com] 
Sent: Thursday, 10 June 2010 07:15
To: Apache AXIS C User List
Subject: Re: encoding problem U+00C0 to U+00E7 are not parsed by libxml2_reader_wrapper.c;36(886)

 

This could be due to a libxml2 issue. Maybe you can try out a newer version of libxml2. In
the libxml2_reader_wrapper there isn't any code related to character encoding. It just calls
the libxml2 reader methods and returns the output. 

 

Regards

Nandika

On Wed, Jun 9, 2010 at 11:40 PM, Stadelmann Josef <josef.stadelmann@axa-winterthur.ch>
wrote:

Dear community

I have an encoding problem somewhere deep in axiom or axutil or libxml2!

The first transaction in the log below shows how the UTF-8 character U+00BF arrives in "191 ¿", and how
it is translated correctly.

In short, processing runs through steps 1 to 13, where the transaction in my web service ends.

 

Further down you see the next transaction and what happens when the UTF-8 character U+00C0 arrives
in "192 Ã?", and how it fails to be read:

[Wed Jun  9 19:38:47 2010] [error] DKB3:[SW-PROJEKTE.webservices.axis2.trunk.c.axiom.src.parser]libxml2_reader_wrapper.c;36(886)


Input is not proper UTF-8, indicate encoding !

Bytes: 0xC3 0x3F 0x3C 0x2F

 -- SEVERITY_ERROR
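
The four bytes in this message can be checked by hand. 0xC3 is the lead byte of a two-byte UTF-8 sequence and must be followed by a continuation byte in the range 0x80 to 0xBF, but 0x3F is an ASCII '?'; 0x3C 0x2F is the "</" of the closing tag. The following Python sketch (not libxml2 code; the function name is hypothetical) reproduces the check and compares it with the correct encoding of U+00C0, which is 0xC3 0x80. A '?' in that position often means that some converter upstream substituted a character it could not map, though the log alone does not show where.

```python
def first_utf8_error(data: bytes):
    """Return (offset, reason) of the first invalid UTF-8 byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, err.reason
    return None


reported = bytes([0xC3, 0x3F, 0x3C, 0x2F])   # bytes from the error message
expected = "\u00c0</".encode("utf-8")        # C3 80 3C 2F

print(first_utf8_error(reported))   # fails at offset 0: C3 has no continuation byte
print(first_utf8_error(expected))   # None, the sequence is valid
```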

When you look at my 2 inputs to be parsed, called <payin></payin>, I consider them
well formed, and I indicate the encoding. 

So what can be the reason that libxml2_reader_wrapper.c;36(886) bails out?

I may have to get a later libxml2 source or Axis2/C sources and build everything from scratch
on my OpenVMS system, because it may well be that something is outdated and that patches have
been applied since my project started in 2006.

In short: I am running character transmission tests doing a Chr(i), where i ranges over the ASCII
code points and the valid UTF-8 code points, in fact those UTF-8 characters we need to convert
to ISO-8859-1 after the read.

The following code points are parsed correctly from my Windows Vista WCF Client to my Axis2
Web Service Server and through the JVM's JNI to some C legacy code:

U+0020 to U+007F are OK

U+00A0 to U+00BF are OK

U+00C0 to U+00E7 fail as shown above

U+00E8 to U+00FF are OK too
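
A loopback test of this kind can also be run offline, before any network or JNI layer is involved, to confirm that every tested code point survives UTF-8 encoding and the later conversion to ISO-8859-1. A minimal sketch in Python, assuming the field value has the form "<number> <character>" as in "191 ¿"; the names are illustrative only:

```python
# The four ranges listed above, as inclusive (low, high) pairs.
RANGES = [(0x20, 0x7F), (0xA0, 0xBF), (0xC0, 0xE7), (0xE8, 0xFF)]


def field_value(i: int) -> str:
    """Build a test value such as '191 ¿' for code point i."""
    return f"{i} {chr(i)}"


def survives_round_trip(i: int) -> bool:
    """Encode as UTF-8, convert to ISO-8859-1 as after the read, and compare."""
    wire = field_value(i).encode("utf-8")
    latin1 = wire.decode("utf-8").encode("iso-8859-1")
    return latin1.decode("iso-8859-1") == field_value(i)


failures = [i for lo, hi in RANGES for i in range(lo, hi + 1)
            if not survives_round_trip(i)]
print(failures)   # an empty list means every code point converts cleanly
```

With standard encoders nothing in these ranges fails, so a failure confined to one range suggests that some conversion step on the path is at fault rather than the code points themselves.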

All the remaining control characters (lower and upper control sets) are intentionally not
sent yet!

What can be the reason that U+00C0 to U+00E7 fail in the parser?

Josef

[Wed Jun  9 19:38:45 2010] [info]

 1. Spg-Legacy fktmap() starts here ------- 1

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2213)

 2. P1 has a current size of: 648, and a value of:

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2221)
 this string gets parsed and converted to an OMElement using axis2/c routines:

<?xml version='1.0' encoding='utf-8'?><payin xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:b="SPS-Payload" xmlns="http://spezpla.axawl.ch"><b:Name>SPS-Payload</b:Name><b:wscol xmlns:c="COL"><c:Name>defaultNameWSCOL</c:Name><c:Item xmlns:d="http://schemas.datacontract.org/2004/07/WCFSpSe" xmlns:e="WS"><d:TWS><e:Name>wsinp</e:Name><e:Item xmlns:f="feld"><d:TELEM><f:Fldnam>TARGET</f:Fldnam><f:Fldval>FKT_LOOPBACK</f:Fldval></d:TELEM><d:TELEM><f:Fldnam>SELECT</f:Fldnam><f:Fldval>191 ¿</f:Fldval></d:TELEM></e:Item></d:TWS><d:TWS><e:Name>wsold</e:Name><e:Item xmlns:f="feld" /></d:TWS><d:TWS><e:Name>wsout</e:Name><e:Item xmlns:f="feld" /></d:TWS></c:Item></b:wscol></payin>

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1382)

 axawl_deserialize_input_payload() ....  2

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1561)

 key 1  : 'TARGET'

 val 1  : 'FKT_LOOPBACK'

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1561)

 key 2  : 'SELECT'

 val 2  : '191 ¿'

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1589)

 got htwsinp from htWSCOL using "wsinp" string constant as key

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1593)

 got htwsold from htWSCOL using "wsold" string constant as key

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1606)

 axawl_deserialize_input_payload() completed  2

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2231)

 3. axawl_deserialize_input_payload() completed

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2239)

 4. axawl_put() completed

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2244)

 5. axawl_merge() completed

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2257)

 6. PWKSP logging on input

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2261)

 key 1  : 'SELECT'

 val 1  : '191 ¿'

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2261)

 key 2  : 'TARGET'

 val 2  : 'FKT_LOOPBACK'

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2263)

 7. PWKSP logging completed, WRAP() has input

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2291)

 8. fktmap()->WRAP() .....

WRAP-BEG------------------

 TARGET: FKT_LOOPBACK

 SELECT: 191 ¿

WRAP-END------------------

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2293)

 9. fktmap()->WRAP() completed

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2299)

 10. axawl_ret() completed

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1628)

 axawl_serialize_output_payload() .....

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1765)

 Converting 2 key's and value's in htwsout using isolat1ToUTF8()

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1810)

 key 1  : 'SELECT'

 val 1  : '191 ¿'

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1810)

 key 2  : 'TARGET'

 val 2  : 'FKT_LOOPBACK'

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1822)

 Serialize root_node to om_output 2

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1831)

 root_node serialization SUCCESS 2

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1847)

 call axawl_dump_output_buffer(output_buffer)

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1892)

 End axawl_serialize_output_payload()

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2304)

 11. axawl_serialize_output_payload() completed

[Wed Jun  9 19:38:45 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2322)

 12. payload_out to parameter P2 copy completed (599 bytes)

[Wed Jun  9 19:38:45 2010] [info]

 13. Spg-Legacy fktmap() ends   here ------- 1

[Wed Jun  9 19:38:47 2010] [info]

 1. Spg-Legacy fktmap() starts here ------- 1

[Wed Jun  9 19:38:47 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2213)

 2. P1 has a current size of: 648, and a value of:

[Wed Jun  9 19:38:47 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2221)
 this string gets parsed and converted to an OMElement using axis2/c routines:

<?xml version='1.0' encoding='utf-8'?><payin xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:b="SPS-Payload" xmlns="http://spezpla.axawl.ch"><b:Name>SPS-Payload</b:Name><b:wscol xmlns:c="COL"><c:Name>defaultNameWSCOL</c:Name><c:Item xmlns:d="http://schemas.datacontract.org/2004/07/WCFSpSe" xmlns:e="WS"><d:TWS><e:Name>wsinp</e:Name><e:Item xmlns:f="feld"><d:TELEM><f:Fldnam>TARGET</f:Fldnam><f:Fldval>FKT_LOOPBACK</f:Fldval></d:TELEM><d:TELEM><f:Fldnam>SELECT</f:Fldnam><f:Fldval>192 Ã?</f:Fldval></d:TELEM></e:Item></d:TWS><d:TWS><e:Name>wsold</e:Name><e:Item xmlns:f="feld" /></d:TWS><d:TWS><e:Name>wsout</e:Name><e:Item xmlns:f="feld" /></d:TWS></c:Item></b:wscol></payin>

[Wed Jun  9 19:38:47 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1382)

 axawl_deserialize_input_payload() ....  2

[Wed Jun  9 19:38:47 2010] [error] DKB3:[SW-PROJEKTE.webservices.axis2.trunk.c.axiom.src.parser]libxml2_reader_wrapper.c;36(886)


Input is not proper UTF-8, indicate encoding !

Bytes: 0xC3 0x3F 0x3C 0x2F

 -- SEVERITY_ERROR

[Wed Jun  9 19:38:47 2010] [error] DKB3:[SW-PROJEKTE.webservices.axis2.trunk.c.axiom.src.parser]libxml2_reader_wrapper.c;36(427)
 error occured in reading xml stream

[Wed Jun  9 19:38:47 2010] [error] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1417)

 Document root is NULL] =  when it is not supposed to be NULL

[Wed Jun  9 19:38:47 2010] [error] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1418)

 unable to create root node from om_document

[Wed Jun  9 19:38:50 2010] [info]

 1. Spg-Legacy fktmap() starts here ------- 1

[Wed Jun  9 19:38:50 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2213)

 2. P1 has a current size of: 628, and a value of:

[Wed Jun  9 19:38:50 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2221)
 this string gets parsed and converted to an OMElement using axis2/c routines:

<?xml version='1.0' encoding='utf-8'?><payin xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:b="SPS-Payload" xmlns="http://spezpla.axawl.ch"><b:Name>SPS-Payload</b:Name><b:wscol xmlns:c="COL"><c:Name>defaultNameWSCOL</c:Name><c:Item xmlns:d="http://schemas.datacontract.org/2004/07/WCFSpSe" xmlns:e="WS"><d:TWS><e:Name>wsinp</e:Name><e:Item xmlns:f="feld"><d:TELEM><f:Fldnam>TARGET</f:Fldnam><f:Fldval>FKT_BYE</f:Fldval></d:TELEM><d:TELEM><f:Fldnam>SELECT</f:Fldnam><f:Fldval /></d:TELEM></e:Item></d:TWS><d:TWS><e:Name>wsold</e:Name><e:Item xmlns:f="feld" /></d:TWS><d:TWS><e:Name>wsout</e:Name><e:Item xmlns:f="feld" /></d:TWS></c:Item></b:wscol></payin>

[Wed Jun  9 19:38:50 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1382)

 axawl_deserialize_input_payload() ....  2

[Wed Jun  9 19:38:50 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1561)

 key 1  : 'TARGET'

 val 1  : 'FKT_BYE'

[Wed Jun  9 19:38:50 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1561)

 key 2  : 'SELECT'

 val 2  : ''

[Wed Jun  9 19:38:50 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1589)

 got htwsinp from htWSCOL using "wsinp" string constant as key

[Wed Jun  9 19:38:50 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1593)

 got htwsold from htWSCOL using "wsold" string constant as key

[Wed Jun  9 19:38:50 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1606)

 axawl_deserialize_input_payload() completed  2

[Wed Jun  9 19:38:50 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2231)

 3. axawl_deserialize_input_payload() completed

[Wed Jun  9 19:38:51 2010] [info]

 1. Spg-Legacy logout() starts here ------- 1

[Wed Jun  9 19:38:51 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2370)

 2. P1 has a current size of: 435, and a value of:

[Wed Jun  9 19:38:51 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1382)

 axawl_deserialize_input_payload() ....  2

[Wed Jun  9 19:38:51 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1561)

 key 1  : 'TARGET'

 val 1  : 'FKT_BYE'

[Wed Jun  9 19:38:51 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(1606)

 axawl_deserialize_input_payload() completed  2

[Wed Jun  9 19:38:51 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2380)

 3. axawl_deserialize_input_payload() completed

[Wed Jun  9 19:38:51 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2388)

 4. axawl_put() completed

[Wed Jun  9 19:38:51 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2393)

 5. after axawl_merge() completed

[Wed Jun  9 19:38:51 2010] [debug] DKB3:[SW-PROJEKTE.SPEZSRVW.axawl.spezpla.servers.SPgServer]SPg-legacy.c;17(2405)

 6. logging PWKSP on input

IA64>




-- 
http://nandikajayawardana.blogspot.com/
WSO2 Inc: http://www.wso2.com

