activemq-users mailing list archives

From atani <jsteven...@bepress.com>
Subject Re: Non-ASCII messages via Stomp being dropped or mangled in 5.5
Date Thu, 21 Jul 2011 19:59:57 GMT

Hi Dejan,


UTF-8 encoded characters in the Basic Multilingual Plane are 1 to 3 bytes in
length (code points above 0xFFFF take four): code points less than or equal
to 127 (hex 0x7F) are one byte, 128 to 2047 (0x07FF) are two bytes, and
2048 (0x0800) to 65535 (0xFFFF) are three bytes.  E.g. the smiley character
in my test encodes to three bytes (0xE2, 0x98, and 0xBB) and the Ö encodes
to two (0xC3 and 0x96).  From what I can tell the stomp messages contain the
correct bytes in the body when I'm sending them with a content-length header.
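
To double-check the byte counts above, here is a minimal Python sketch.  The
helper name utf8_length is my own for illustration (not part of any Stomp
client or ActiveMQ), and it ignores the surrogate range for simplicity:

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 needs for a single code point."""
    if code_point <= 0x7F:      # ASCII range
        return 1
    if code_point <= 0x7FF:     # up to 2047
        return 2
    if code_point <= 0xFFFF:    # rest of the Basic Multilingual Plane
        return 3
    return 4                    # supplementary planes


# Compare the helper against Python's own encoder for the test characters.
for ch in ("A", "Ö", "☻"):
    encoded = ch.encode("utf-8")
    assert len(encoded) == utf8_length(ord(ch))
    hex_bytes = " ".join(f"0x{b:02X}" for b in encoded)
    print(f"U+{ord(ch):04X} encodes to {len(encoded)} byte(s): {hex_bytes}")
```

Comparing against str.encode means the boundaries are checked by the
language's own encoder rather than by my arithmetic.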


As text, the utf8 message is:

Wed Jul 20 10:32:50 2011 utf8 encoded: unicode >☻< smiles - 57 characters
long


but in the stomp "bytes message" it is:

Wed Jul 20 10:32:50 2011 utf8 encoded: unicode >☻< smiles - 59 bytes long,
which is the length reported in the content-length header.
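
For reference, a rough Python sketch of how such a frame can be assembled so
that content-length counts bytes rather than characters.  The function name
build_send_frame and the destination /queue/test are placeholders of mine,
not what my actual client uses:

```python
def build_send_frame(destination: str, text: str) -> bytes:
    """Assemble a Stomp SEND frame with a byte-accurate content-length."""
    body = text.encode("utf-8")            # encode first, then measure
    head = (
        "SEND\n"
        f"destination:{destination}\n"     # placeholder destination
        f"content-length:{len(body)}\n"    # number of bytes, not characters
        "\n"                               # blank line ends the headers
    )
    return head.encode("utf-8") + body + b"\x00"   # NUL terminates the frame


text = "Wed Jul 20 10:32:50 2011 utf8 encoded: unicode >☻< smiles"
frame = build_send_frame("/queue/test", text)

print(len(text), "characters")             # 57
print(len(text.encode("utf-8")), "bytes")  # 59
```

The two-unit difference comes entirely from the smiley, which is one
character but three bytes.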



I'm not sure what you mean by "convert it explicitly to a byte array" in the
context of the Stomp protocol.  The Stomp message being sent with the
content-length header contains the bytes that result from encoding the
message text in utf-8, and the content-length header reports that number of
bytes accurately.
Unfortunately I can't get the full message / actual bytes received logged in
the ActiveMQ stomp log - is there another class that I could crank the
logging up on to get the incoming message logged in full?



--
View this message in context: http://activemq.2283324.n4.nabble.com/Non-ASCII-messages-via-Stomp-being-dropped-or-mangled-in-5-5-tp3679601p3684826.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.