[ https://issues.apache.org/activemq/browse/AMQCPP-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=51117#action_51117 ]

Martin Schlapfer commented on AMQCPP-235:
-----------------------------------------

With respect to a TextMessage, it looks like longer strings (>65535) are supported. The length is encoded with 4 bytes rather than 2 bytes.

For character values greater than 127 contained in a text message:
(1) C++ to C++ Client: the (un)marshalling works (except in the case of a null character), because there is no UTF8 encoding/decoding performed.
(2) C++ to Java Client: the Java client throws a java.io.UTFDataFormatException when getText is called on the text message (at this point it is "unmarshalled" using UTF8 decoding, see java.org.apache.activemq.util.MarshallingSupport.readUTF8).

So it looks like an implementation of UTF8 encoding/decoding for the TextMessage is necessary to be interoperable with the Java client when character values are larger than 127.

> UTF8 length marshalling bug in openwire readString and writeString.
> -------------------------------------------------------------------
>
>                 Key: AMQCPP-235
>                 URL: https://issues.apache.org/activemq/browse/AMQCPP-235
>             Project: ActiveMQ C++ Client
>          Issue Type: Bug
>          Components: Openwire
>         Environment: Windows XP / Visual Studio 2005
>            Reporter: Martin Schlapfer
>            Assignee: Timothy Bish
>            Priority: Minor
>         Attachments: OpenwireStringSupport.cpp.patch, OpenwireStringSupportTest.cpp.patch, OpenwireStringSupportTest.h.patch
>
>
> In investigating a bug for the check "if( str->size() > 65536 )" which should be "if( str->size() > 65535 )" in writeString(), I found a couple of other problems:
> (1) The OpenwireStringSupport::readString method should read the utf8 length as an unsigned short rather than a short. The problem is that utf8 encoded strings (written using writeString) longer than 32767 bytes will become truncated when read back using readString(), because their length no longer fits in a signed 16-bit value.
> (2) The writeString() method should also check the value of utflen against 65535 after determining the UTF8 length of the encoded string, because with support for character values greater than 127, a single input byte can encode to 2 UTF8 bytes.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
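
As a rough illustration of points (1) and (2) in the issue description, the following C++ sketch writes a 2-byte length prefix, checks the encoded length (not the input length) against 65535, and reads the prefix back as an unsigned 16-bit value. The names encodedLength, writeStringSketch and readStringSketch are hypothetical and are not the OpenwireStringSupport API, and this is not the attached patch. The sketch assumes that each input byte is treated as a code point in 0..255 and encoded with the modified UTF-8 rules of java.io.DataOutput.writeUTF, where a null character becomes the two bytes 0xC0 0x80.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Hypothetical helper: encoded size when every input byte is taken as a
// code point in 0..255 (assumption). 0x01..0x7F need 1 byte; 0x00 and
// 0x80..0xFF need 2 bytes under modified UTF-8.
std::size_t encodedLength(const std::string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        n += (c >= 0x01 && c <= 0x7F) ? 1 : 2;
    }
    return n;
}

// Sketch of a writer: the limit is checked on utflen, because the encoded
// form can be up to twice as long as the input.
Bytes writeStringSketch(const std::string& s) {
    const std::size_t utflen = encodedLength(s);
    if (utflen > 65535) {
        throw std::length_error("encoded string exceeds 65535 bytes");
    }
    Bytes out;
    out.reserve(utflen + 2);
    // Big-endian 2-byte length prefix.
    out.push_back(static_cast<unsigned char>((utflen >> 8) & 0xFF));
    out.push_back(static_cast<unsigned char>(utflen & 0xFF));
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c >= 0x01 && c <= 0x7F) {
            out.push_back(c);
        } else {
            out.push_back(static_cast<unsigned char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<unsigned char>(0x80 | (c & 0x3F)));
        }
    }
    assert(out.size() == utflen + 2);
    return out;
}

// Sketch of a reader: the prefix is read as an unsigned 16-bit value, so
// lengths in 0x8000..0xFFFF do not turn negative.
std::string readStringSketch(const Bytes& in) {
    if (in.size() < 2) {
        throw std::runtime_error("missing length prefix");
    }
    const std::uint16_t utflen =
        static_cast<std::uint16_t>((in[0] << 8) | in[1]);
    const std::size_t end = 2u + utflen;
    if (in.size() < end) {
        throw std::runtime_error("truncated payload");
    }
    std::string out;
    std::size_t i = 2;
    while (i < end) {
        const unsigned char b = in[i];
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));
            ++i;
        } else if ((b & 0xE0) == 0xC0 && i + 1 < end &&
                   (in[i + 1] & 0xC0) == 0x80) {
            const unsigned int cp = ((b & 0x1Fu) << 6) | (in[i + 1] & 0x3Fu);
            if (cp > 0xFF) {
                throw std::runtime_error("code point does not fit in one byte");
            }
            out.push_back(static_cast<char>(cp));
            i += 2;
        } else {
            throw std::runtime_error("unsupported or malformed sequence");
        }
    }
    return out;
}
```

With these helpers, a 40000-character ASCII string round-trips through writeStringSketch and readStringSketch, whereas 40000 bytes of value 0xE9 would need 80000 encoded bytes and are rejected by the utflen check even though the input itself is shorter than 65535 bytes.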