thrift-dev mailing list archives

From "XB (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (THRIFT-1727) Ruby-1.9: data loss: "binary" fields are re-encoded
Date Tue, 13 Nov 2012 00:13:12 GMT

    [ https://issues.apache.org/jira/browse/THRIFT-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495772#comment-13495772 ]

XB edited comment on THRIFT-1727 at 11/13/12 12:11 AM:
-------------------------------------------------------

You are right: the read path needs the same support as the write path.

However, this does not look particularly complicated (see https://issues.apache.org/jira/browse/THRIFT-1726):

{noformat}
diff --git a/lib/rb/lib/thrift/struct_union.rb b/lib/rb/lib/thrift/struct_union.rb
index 4e0afcf..7df859c 100644
--- a/lib/rb/lib/thrift/struct_union.rb
+++ b/lib/rb/lib/thrift/struct_union.rb
@@ -100,6 +100,12 @@ module Thrift
           end
         end
         iprot.read_set_end
+      when Types::STRING
+        if field[:binary]
+          value = Bytes.force_binary_encoding(iprot.read_type(field[:type]))
+        else
+          value = iprot.read_type(field[:type])
+        end
       else
         value = iprot.read_type(field[:type])
       end
{noformat}
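
To illustrate what the patched branch is meant to achieve, here is a minimal sketch in plain Ruby that does not load Thrift. The helpers `force_binary` and `read_string_field` are hypothetical stand-ins for `Bytes.force_binary_encoding` and the new `when Types::STRING` branch; the assumption is that forcing binary encoding only retags the string as ASCII-8BIT and leaves every byte untouched.

```ruby
# Hypothetical stand-in for Bytes.force_binary_encoding from the patch:
# retag a copy of the string as raw bytes without changing any byte.
def force_binary(str)
  str.dup.force_encoding(Encoding::BINARY)
end

# Hypothetical stand-in for the patched branch in struct_union.rb.
# `raw` plays the role of the value returned by iprot.read_type.
def read_string_field(field, raw)
  field[:binary] ? force_binary(raw) : raw
end

# Assumed input: the protocol hands back bytes tagged as UTF-8.
wire = "\xFF\x00\x80".dup.force_encoding(Encoding::UTF_8)

binary_value = read_string_field({ binary: true }, wire)
text_value   = read_string_field({ binary: false }, wire)

puts binary_value.encoding  # binary fields come back as ASCII-8BIT
puts text_value.encoding    # non-binary fields keep the protocol's tag
```

Only the encoding tag differs between the two results; the byte content is the same in both cases.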

                
                  
> Ruby-1.9: data loss: "binary" fields are re-encoded
> ---------------------------------------------------
>
>                 Key: THRIFT-1727
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1727
>             Project: Thrift
>          Issue Type: Bug
>          Components: Ruby - Library
>    Affects Versions: 0.9
>         Environment: JRuby 1.6.8 using "--1.9" command line parameter.
>            Reporter: XB
>
> When a binary field of a Thrift object is set to binary data (e.g. a string whose
> encoding is "ASCII-8BIT") and the object is then serialized, the binary data is re-encoded.
> That is, it is treated not as a sequence of bytes but as a sequence of characters
> in the ISO-8859-1 encoding. This assumed ISO-8859-1 character sequence is then
> converted to UTF-8 (by BinaryProtocol or CompactProtocol). As a result, every byte
> whose value is between 0x80 (inclusive) and 0x100 (exclusive) is converted into a
> multi-byte sequence, which corrupts the data.
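
The effect described above can be reproduced in plain Ruby without Thrift. The sketch below only simulates the re-encoding step (bytes read as ISO-8859-1, then transcoded to UTF-8); it is not the actual BinaryProtocol or CompactProtocol code path, and the sample payload is made up for illustration.

```ruby
# Made-up binary payload: one ASCII byte plus three bytes >= 0x80.
payload = [0x41, 0x80, 0xE9, 0xFF].pack("C*") # pack("C*") yields ASCII-8BIT

# Simulate the faulty path: treat the bytes as ISO-8859-1 characters,
# then transcode those characters to UTF-8.
reencoded = payload.dup
                   .force_encoding(Encoding::ISO_8859_1)
                   .encode(Encoding::UTF_8)

puts payload.bytesize    # 4
puts reencoded.bytesize  # 7: each byte >= 0x80 became two bytes

# Show the resulting bytes in hex.
puts reencoded.bytes.map { |b| format("%02X", b) }.join(" ")
```

The ASCII byte 0x41 survives unchanged, while 0x80, 0xE9 and 0xFF each expand to a two-byte UTF-8 sequence, so a receiver expecting the original four bytes gets seven.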

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
