qpid-dev mailing list archives

From: Justin Ross <justin.r...@gmail.com>
Subject: Re: UTF8 / binary strings in dynamic languages
Date: Wed, 21 Aug 2013 21:43:56 GMT
On Wed, Aug 21, 2013 at 4:04 PM, Jimmy Jones <jimmyjones2@gmx.co.uk> wrote:
>> 3. If the language string is an overloaded text/bytes type, as is
>> regrettably quite common, what do we do then?
>> The current answer to this question is "send it as vbin". That's very
>> safe, insofar as it won't throw any sort of encoding exception. It
>> does not, however, always honor what I think is the user's more
>> typical intention: produce an ascii string at the other end.
> I guess the problem is between dynamically and statically typed languages.
> If you stay with the same language you don't notice anything, but this
> slightly defeats the object of AMQP!

I don't think it's between dynamic and static.  It's between languages
that model text and data in one type (C-style strings) and languages
that enforce a division.  In the latter camp, there's Java and Python
3, for instance.

And this is the bug we see reported.  Someone tries to send a "string"
from the first camp to the second, and then they think something went
wrong: it went in as a string and came out as bytes.
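
A minimal Python 3 sketch of that report, for illustration only. The
function wire_type is hypothetical and is not a Qpid API; it only mimics
the "send ambiguous strings as vbin" rule quoted above, and "vbin" and
"str16" are used as plain labels.

```python
def wire_type(value):
    """Hypothetical stand-in for a client's type mapping (not a Qpid API)."""
    if isinstance(value, str):
        return "str16"   # unambiguous text
    if isinstance(value, (bytes, bytearray)):
        return "vbin"    # real bytes, or an overloaded C-style string
    raise TypeError("unsupported type: %s" % type(value).__name__)


# A sender from the first camp has a single string type, so under the
# current rule its "string" travels as binary.
sent_by_first_camp = b"hello"

# A receiver from the second camp (Python 3 here) gets bytes, not text.
received = sent_by_first_camp
print(wire_type(received))         # the label chosen for the wire
print(isinstance(received, str))   # is it text on the receiving side?
print(received == "hello")         # does it compare equal to the text?
```

In Python 3 the last comparison is false even though the payload is
plain ascii, which is how the mismatch usually surfaces for users.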

>> So for 3, I'd like to consider the possibility of, by default, sending
>> ambiguous language strings as ascii rendered to amqp str16. This
>> requires an encoding step that may produce errors. And maybe that's
>> just too obnoxious! That's what I'd like to know.
> I'm not convinced, but I'm prepared to be convinced. If I put a binary
> value in a map and encoded it some of the time it might be valid utf8,
> other times not. Could this lead to a class of subtle bugs where a receiver
> written in a statically typed language will work most of the time when
> the value appears as a vbin, but not other times when it "accidentally"
> appears as a str16?

"If I put a binary value in a map and encoded it some of the time it
might be valid utf8, other times not."  This shouldn't be allowed to
happen, IMO.  You meant it to be a binary value--we have to find a way
to capture and preserve that information.
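
One way to picture the difference, as a sketch under assumptions rather
than a proposal for any binding: sniffed_wire_type guesses from the
content, which is the data-dependent behavior worried about above, while
declared_wire_type relies on a hypothetical Binary marker type so the
user's intent decides the wire type. None of these names exist in Qpid.

```python
def sniffed_wire_type(raw):
    """Guess from content (hypothetical): the result depends on the data."""
    try:
        raw.decode("utf-8")
        return "str16"
    except UnicodeDecodeError:
        return "vbin"


class Binary(bytes):
    """Hypothetical marker type: the user declares 'this is data, not text'."""


def declared_wire_type(value):
    """Decide from the declared type only, never from the content."""
    if isinstance(value, Binary):
        return "vbin"
    if isinstance(value, str):
        return "str16"
    raise TypeError("ambiguous value: wrap it in Binary or pass a str")


# Content sniffing: the same logical field flips type with its payload.
print(sniffed_wire_type(b"abc"))        # happens to be valid utf8
print(sniffed_wire_type(b"\xff\xfe"))   # not valid utf8

# Declared intent: the wire type is stable whatever the payload is.
print(declared_wire_type(Binary(b"abc")))
print(declared_wire_type(Binary(b"\xff\xfe")))
```

With the marker, a receiver always sees the same type for that field,
so the "works most of the time" class of bug does not arise.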

>> In summary, if we have a way to determine what the user wanted (text
>> or bytes), we should try to carry that through on the wire. At the
>> following URL I've tried to map out what type information we can get
>> for each language. Please update it as you please.
>>  https://cwiki.apache.org/confluence/display/qpid/Language+support+for+unambiguous+text+string+and+byte+array+types
> I've just signed up, but don't seem to be able to edit the page? I'll
> add the stuff about utf8::upgrade when I can edit.

I looked, and I don't seem to have the privileges to change your
privileges.  Anyone else have the ability to do this?

