thrift-dev mailing list archives

From "bryan newbold (Updated) (JIRA)" <>
Subject [jira] [Updated] (THRIFT-1229) Python fastbinary.c can not handle unicode as generated python code
Date Thu, 01 Dec 2011 05:19:40 GMT


bryan newbold updated THRIFT-1229:

    Attachment: python_fastbinary_utf8.patch

This git-style patch adds an extra arg to the fastbinary methods; see also

As a disclaimer, I don't have much experience writing Python C-API code, and this was not
tested with Python 3 at all. Global variables or separate functions (e.g. 'encode_binary_utf8')
may be a more appropriate style.

The binary encode function checks every string argument passed to it and only UTF-8 encodes
Unicode PyObjects; byte strings are written unchanged. This is arguably poor behavior, but it
fit our use case best. If the utf8strings flag is set, then every string read is decoded as
UTF-8. This could lead to a situation where a client with the utf8strings flag set writes a
non-UTF-8 byte string without any error, but the server (also with the utf8strings flag set)
has trouble decoding it.
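
The asymmetry described above can be sketched in pure Python. This is only an illustration, not
the patch itself: 'encode_string' and 'decode_string' are hypothetical names, and Python 3
'str'/'bytes' stand in for the Python 2 'unicode'/'str' types that the C code actually handles.

```python
# Hypothetical stand-ins for the patched fastbinary string handling;
# these names are illustrative and are not part of the Thrift API.

def encode_string(value, utf8strings=False):
    """Encode only text objects as UTF-8; pass byte strings through unchecked."""
    if utf8strings and isinstance(value, str):
        return value.encode("utf-8")
    return value  # byte strings are written as-is, valid UTF-8 or not


def decode_string(raw, utf8strings=False):
    """Decode every string read from the wire as UTF-8 when the flag is set."""
    if utf8strings:
        return raw.decode("utf-8")  # raises UnicodeDecodeError on invalid bytes
    return raw


# A text value round-trips cleanly.
wire = encode_string("caf\u00e9", utf8strings=True)
roundtrip = decode_string(wire, utf8strings=True)

# A non-UTF-8 byte string is written without complaint...
bad_wire = encode_string(b"\xff\xfe", utf8strings=True)

# ...but the reader fails when it tries to decode it.
try:
    decode_string(bad_wire, utf8strings=True)
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

The writer never validates byte strings, so the error only surfaces on the reading side.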

Code generated with the optional utf8strings flag to fastbinary would require the most recent
version of the Python libraries to be installed. I'm not sure whether that flavor of backwards
incompatibility is an issue.

The is non-functional; see
for a partial fix. 
> Python fastbinary.c can not handle unicode as generated python code
> -------------------------------------------------------------------
>                 Key: THRIFT-1229
>                 URL:
>             Project: Thrift
>          Issue Type: Bug
>          Components: Python - Compiler, Python - Library
>    Affects Versions: 0.7
>         Environment: mac osx 10.6
>            Reporter: Favo
>         Attachments: python_fastbinary_utf8.patch
> THRIFT-395 (r959516) fixed Python unicode support by adding a parameter to the thrift
> command line for the py generator. However, this does not affect fastbinary.c. A normally
> generated read/write function looks like the one below; note that the function returns
> before reaching the unicode handling logic.
> {code:borderStyle=solid}
>   def write(self, oprot):
>     if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and self.thrift_spec is not None and fastbinary is not None:
>       oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, self.thrift_spec)))
>       return
>     if self.ip is not None:
>       oprot.writeFieldBegin('ip', TType.STRING, 6)
>       oprot.writeString(self.ip.encode('utf-8'))
>       oprot.writeFieldEnd()
> {code}
> Any suggestion for this?
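
To make the early return in the quoted code concrete, here is a minimal sketch using stand-in
classes. 'FakeAcceleratedProtocol', 'fake_encode_binary', and 'Record' are hypothetical names,
not Thrift APIs; they only mimic the control flow of the generated write() method.

```python
# Hypothetical stand-ins; none of these names come from the Thrift library.

class FakeAcceleratedProtocol:
    """Plays the role of TBinaryProtocolAccelerated in this sketch."""

    def __init__(self):
        self.written = []


def fake_encode_binary(struct):
    """Plays the role of fastbinary.encode_binary: it has no UTF-8 step."""
    return struct.ip  # passes the value through as-is


class Record:
    def __init__(self, ip):
        self.ip = ip

    def write(self, oprot, accelerated=True):
        if accelerated:
            # Fast path: returns before the encode call below is reached.
            oprot.written.append(fake_encode_binary(self))
            return
        # Slow path: the generated code encodes text as UTF-8 here.
        oprot.written.append(self.ip.encode("utf-8"))


fast = FakeAcceleratedProtocol()
Record("caf\u00e9").write(fast, accelerated=True)

slow = FakeAcceleratedProtocol()
Record("caf\u00e9").write(slow, accelerated=False)
```

On the fast path the text value reaches the encoder unencoded, while the slow path produces
UTF-8 bytes, which is the mismatch the issue describes.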
