activemq-dev mailing list archives

From Bhavani Iyer <bhavan...@gmail.com>
Subject Re: [jira] Created: (AMQCPP-261) Handle Multibyte Strings or Strings encoded in Charsets other than US-ASCII
Date Wed, 20 Jan 2010 17:10:02 GMT

Hi,

I recently migrated from 2.1.3 to 3.1.0 and found that I can no longer send
UTF-8 multibyte characters in the payload of a TextMessage. As a workaround
until this issue is resolved, I modified the ActiveMQTextMessage::getText()
method to return the raw bytes from getContent(), bypassing the call to
OpenWireConnector::readString().

Here is the modified ActiveMQTextMessage::getText():

    if( this->text.get() != NULL ) {
        return *( this->text.get() );
    } else {
        if( this->getContent().size() <= 4 ) {
            return "";
        }
        // To get around the ASCII text restriction
        return std::string( (const char*)&getContent()[4], getContent().size()-4 );
    }
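
For comparison, here is a standalone sketch of the same extraction with a bounds check against the length prefix. It is not code from ActiveMQ-CPP: the helper name rawTextFromContent is hypothetical, and it assumes the first 4 bytes of the content are a big-endian byte count followed by the string bytes. Like the workaround above, it performs no charset validation or conversion.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of ActiveMQ-CPP). Returns the bytes that
// follow the 4-byte prefix, unchanged and unvalidated.
std::string rawTextFromContent(const std::vector<unsigned char>& content) {
    const std::size_t kPrefix = 4;  // assumed size of the length prefix

    if (content.size() <= kPrefix) {
        return std::string();
    }

    // Assumption: the prefix holds the payload length as a big-endian integer.
    const std::size_t declared =
        (static_cast<std::size_t>(content[0]) << 24) |
        (static_cast<std::size_t>(content[1]) << 16) |
        (static_cast<std::size_t>(content[2]) << 8) |
         static_cast<std::size_t>(content[3]);

    // Never read more than the buffer actually holds.
    const std::size_t available = content.size() - kPrefix;
    const std::size_t length = std::min(declared, available);

    return std::string(reinterpret_cast<const char*>(&content[kPrefix]), length);
}
```

Taking the buffer by reference once also avoids calling getContent() several times, in case that accessor returns a copy in a given version.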

Please let me know if this has any unintended consequences.
Thanks

JIRA jira@apache.org wrote:
> 
> Handle Multibyte Strings or Strings encoded in Charsets other than
> US-ASCII
> ---------------------------------------------------------------------------
> 
>                  Key: AMQCPP-261
>                  URL: https://issues.apache.org/activemq/browse/AMQCPP-261
>              Project: ActiveMQ C++ Client
>           Issue Type: Improvement
>           Components: CMS Impl, Decaf, Openwire
>     Affects Versions: 3.0.1
>             Reporter: Timothy Bish
>             Assignee: Timothy Bish
> 
> 
> The CMS API defines the interface for Strings in the TextMessage using the
> C++ std::string and const char* primitives and doesn't consider character
> encodings in its interface or the use of multibyte string representations.  
> 
> In order to allow the use of Strings between Java and C++ and .NET clients
> the strings in the TextMessage as well as those in MapMessage,
> StreamMessage, and BytesMessage (when writeUTF and readUTF are called) as
> well as message properties of the string type are encoded in the Java
> standard Modified UTF-8 format for serialized strings.  This design makes
> the assumption that strings passed are in US-ASCII format and that the
> strings from the broker are also encoded with no char values greater than
> 255 and throws an exception if one is encountered.  
> 
> The CMS interface needs to be extended to allow for more flexible string
> handling and offer a mechanism to deal with string encodings other than
> ASCII. 
> 
> Another alternative is to change the assumption about strings in the CMS
> API to assume that all strings are given as either ASCII strings with chars
> < 127 and no embedded nulls or are already encoded by the user as Modified
> UTF-8 so that a Java or .NET client can read all strings sent
> in CMS Messages as well.
> 
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
> 
> 
> 
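
On the Modified UTF-8 format mentioned in the quoted issue: it differs from standard UTF-8 in two cases. U+0000 is written as the two bytes 0xC0 0x80 rather than a single zero byte, and characters above U+FFFF are written as a surrogate pair of two 3-byte sequences rather than one 4-byte sequence. Raw bytes passed through unchanged are therefore identical in both encodings only when neither case occurs. A minimal sketch of that check follows; the function name is hypothetical and the input is assumed to be well-formed standard UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true when a well-formed standard UTF-8 string
// has exactly the same byte sequence in Java's Modified UTF-8.
bool isUnchangedInModifiedUtf8(const std::string& utf8) {
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(utf8[i]);

        // Modified UTF-8 never contains a zero byte; U+0000 becomes 0xC0 0x80.
        if (c == 0x00) {
            return false;
        }

        // Lead bytes 0xF0 and above start 4-byte sequences (above U+FFFF),
        // which Modified UTF-8 writes as two 3-byte surrogate halves instead.
        if (c >= 0xF0) {
            return false;
        }
    }
    return true;
}
```

For example, isUnchangedInModifiedUtf8("\xC3\xA9") (the character U+00E9) is true, while a string containing an embedded null or a 4-byte sequence such as "\xF0\x9F\x98\x80" is not.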

-- 
View this message in context: http://old.nabble.com/-jira--Created%3A-%28AMQCPP-261%29-Handle-Multibyte-Strings-or-Strings-encoded-in-Charsets-other-than-US-ASCII-tp25265866p27245213.html
Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.

