ws-dev mailing list archives

From Apache Wiki <>
Subject [Ws Wiki] Update of "tspp" by DineshPremalal
Date Tue, 03 May 2005 11:49:11 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Ws Wiki" for change notification.

The following page has been changed by DineshPremalal:

  === Utf-8 and Utf-16 Support for tspp ===
+    tspp can now parse both Utf-8 and Utf-16 encoded documents. It determines the encoding
from the BOM (Byte Order Mark) at the start of the document:
+       Utf-8   -   0xef 0xbb 0xbf
+       Utf-16   -   0xff 0xfe (Little Endian)
+                  0xfe 0xff (Big Endian)
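The BOM check described above can be sketched in C roughly as follows. This is a minimal illustration, not code from tspp: the names encoding_t and detect_bom are hypothetical, and a buffer without a recognised BOM is simply reported as unknown, which is an assumption about how such input might be handled.

```c
#include <stddef.h>

/* Hypothetical names for illustration; not taken from the tspp sources. */
typedef enum {
    ENC_UNKNOWN,    /* no BOM recognised */
    ENC_UTF8,       /* 0xef 0xbb 0xbf */
    ENC_UTF16_LE,   /* 0xff 0xfe */
    ENC_UTF16_BE    /* 0xfe 0xff */
} encoding_t;

/* Inspect the first bytes of a buffer and report the encoding that its
 * BOM indicates. *bom_len receives the number of BOM bytes to skip
 * before parsing (0 when no BOM is recognised). */
static encoding_t detect_bom(const unsigned char *buf, size_t len,
                             size_t *bom_len)
{
    *bom_len = 0;
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        *bom_len = 3;
        return ENC_UTF8;
    }
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        *bom_len = 2;
        return ENC_UTF16_LE;
    }
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        *bom_len = 2;
        return ENC_UTF16_BE;
    }
    return ENC_UNKNOWN;
}
```

A caller would typically pass the first few bytes read from the document and then start parsing at buf + bom_len.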
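+   For the time being it supports only Utf-16 characters whose value is smaller than 65535 (2 bytes).
Characters outside that range, which Utf-16 encodes as surrogate pairs, are not yet supported.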
+   It gives Utf-8 encoded strings as output. If the user wants Utf-16 output instead,
the user should define a macro variable at compile time.
+ ==== UNICODE_OUT ====
+            The user has to define this compile-time variable in order to obtain Utf-16 output,
because the parser's default encoding is Utf-8.
+            When this compile-time variable is defined, the parser redefines its output
method (char *toString(UTF16_char unicodeState)) to give Utf-16 output.
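The general technique of switching the output encoding with a compile-time macro can be sketched as below. This is only an illustration of the idea and makes no claim about how tspp implements it: out_char_t and encode_bmp are hypothetical names, and the sketch only covers code points below 0x10000, in line with the 2-byte limit mentioned above.

```c
#include <stddef.h>

/* Hypothetical sketch: out_char_t and encode_bmp are illustrative names,
 * not part of tspp. Build with -DUNICODE_OUT to select Utf-16 output. */
#ifdef UNICODE_OUT
typedef unsigned short out_char_t;   /* one Utf-16 code unit */
#else
typedef char out_char_t;             /* one Utf-8 byte */
#endif

/* Encode one code point below 0x10000 into out, which must have room
 * for 3 units. Returns the number of out_char_t units written.
 * No validation is done (for example, surrogate values are not rejected). */
static size_t encode_bmp(unsigned int cp, out_char_t *out)
{
#ifdef UNICODE_OUT
    /* A code point below 0x10000 is a single Utf-16 code unit. */
    out[0] = (out_char_t)cp;
    return 1;
#else
    if (cp < 0x80) {                 /* 1 byte: 0xxxxxxx */
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {                /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (char)(0xE0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char)(0x80 | (cp & 0x3F));
    return 3;
#endif
}
```

With a typical compiler the macro would be supplied on the command line, for example cc -DUNICODE_OUT file.c; without it the same source produces Utf-8 bytes.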
