Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Ws Wiki" for change notification.
The following page has been changed by DineshPremalal:
http://wiki.apache.org/ws/tspp
------------------------------------------------------------------------------
=== Utf-8 and Utf-16 Support for tspp ===
-
+ tspp can now parse both Utf-8 and Utf-16 encoded documents. It determines the encoding
from the BOM (Byte Order Mark) at the start of the document:
+
+ Utf-8 - 0xef 0xbb 0xbf
+
+ Utf-16 - 0xff 0xfe (Little Endian)
+
+ Utf-16 - 0xfe 0xff (Big Endian)
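+
+ The BOM check can be sketched as follows. This is only an illustration of the byte comparisons listed above, not the tspp source; the names Encoding and detectEncoding are hypothetical.

```cpp
#include <cstddef>

// Encodings that can be recognised from a leading byte order mark.
enum class Encoding { Utf8, Utf16LittleEndian, Utf16BigEndian, Unknown };

// Inspect the first bytes of a buffer and report which BOM, if any, is present.
// detectEncoding is a hypothetical helper, not a function taken from tspp.
Encoding detectEncoding(const unsigned char* data, std::size_t size) {
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return Encoding::Utf8;
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return Encoding::Utf16LittleEndian;
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return Encoding::Utf16BigEndian;
    return Encoding::Unknown;  // no BOM found; the caller must pick a default
}
```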
+
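+ For the time being it supports only Utf-16 characters that fit in 2 bytes, that is, code
points no larger than 65535 (0xFFFF). Characters that Utf-16 encodes as surrogate pairs are not yet supported.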
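+
+ As a rough sketch of what the 2-byte restriction means in practice, the code below reads one 16-bit code unit and converts it to Utf-8, refusing surrogate code units. The helpers readCodeUnit and appendUtf8 are hypothetical and are not taken from tspp.

```cpp
#include <cstdint>
#include <string>

// Read one 16-bit code unit from two bytes, honouring the byte order given by the BOM.
std::uint16_t readCodeUnit(const unsigned char* p, bool littleEndian) {
    return littleEndian
        ? static_cast<std::uint16_t>(p[0] | (p[1] << 8))
        : static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Append the Utf-8 form of a single 2-byte character to out.
// Returns false for surrogate code units (0xD800-0xDFFF), which only make
// sense as part of a pair and are therefore outside the supported range.
bool appendUtf8(std::uint16_t unit, std::string& out) {
    if (unit >= 0xD800 && unit <= 0xDFFF) return false;
    if (unit < 0x80) {
        out += static_cast<char>(unit);                         // 1 byte
    } else if (unit < 0x800) {
        out += static_cast<char>(0xC0 | (unit >> 6));           // 2 bytes
        out += static_cast<char>(0x80 | (unit & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (unit >> 12));          // 3 bytes
        out += static_cast<char>(0x80 | ((unit >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (unit & 0x3F));
    }
    return true;
}
```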
+
+
+ It produces Utf-8 encoded strings as output. If the user wants Utf-16 output instead,
a macro must be defined at compile time.
+
+ ==== UNICODE_OUT ====
+
+ The user has to define this macro at compile time in order to obtain Utf-16 output,
because the parser's default output encoding is Utf-8.
+ When this macro is defined, the parser redefines its output
method (char *toString(UTF16_char unicodeState)) so that it gives Utf-16 output.
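+
+ One way such a compile-time switch can work is sketched below. Only the macro name UNICODE_OUT comes from this page; OutputString, OUT_LITERAL and elementName are hypothetical names used for illustration, and the real tspp code may be organised differently.

```cpp
#include <string>

// UNICODE_OUT is the macro named above; everything else here is illustrative.
#ifdef UNICODE_OUT
typedef std::u16string OutputString;   // Utf-16 output when the macro is defined
#define OUT_LITERAL(s) u##s
#else
typedef std::string OutputString;      // Utf-8 output by default
#define OUT_LITERAL(s) s
#endif

// Hypothetical accessor whose return type follows the selected output encoding.
OutputString elementName() {
    return OUT_LITERAL("item");
}
```

+ With GCC, for example, the Utf-16 variant of this sketch would be selected by passing -DUNICODE_OUT on the compiler command line.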
premalal@opensource.lk