avro-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Martin Kleppmann (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AVRO-1499) Ruby 2+ Writes Invalid avro files using the avro gem
Date Thu, 03 Jul 2014 15:18:25 GMT

     [ https://issues.apache.org/jira/browse/AVRO-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Martin Kleppmann updated AVRO-1499:
-----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I see what you're saying; I just fear that by adding String#bytesize (even in a way that's
compatible) could cause other things to break if they are badly written. E.g. a library that
uses {{''.respond_to?(:bytesize)}} to check whether it's running in 1.9 might be misled into
thinking that it's running in 1.9, even though it's actually 1.8. Perhaps that's overly paranoid,
but I'd rather be conservative.

I've committed this change. Thanks for your contribution!

> Ruby 2+ Writes Invalid avro files using the avro gem
> ----------------------------------------------------
>
>                 Key: AVRO-1499
>                 URL: https://issues.apache.org/jira/browse/AVRO-1499
>             Project: Avro
>          Issue Type: Bug
>          Components: ruby
>    Affects Versions: 1.7.5
>            Reporter: Michael Ries
>            Assignee: Martin Kleppmann
>              Labels: ruby
>             Fix For: 1.7.7
>
>         Attachments: AVRO-1499-2.patch, AVRO-1499-3.patch, AVRO-1499.patch
>
>
> The rubygem writes corrupted avro files under ruby 2.0.0 and ruby 2.1.1. It appears to
work correctly under jruby-1.7.10 and ruby 1.9.3.
> Here is a reproducible:
> ```ruby
> require 'avro'
>  
> data = [
>   {"guid"=>"144045de-eb44-dd1b-d9af-6c8b5d41a96e", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119",
"name"=>"My Awesome Bank", "created_at"=>1390617818, "updated_at"=>1398180288, "deleted_at"=>nil},
>   {"guid"=>"51e06057-14d2-7527-81fa-b07dba0a263b", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119",
"name"=>"Student Loans R' Us", "created_at"=>1386178342, "updated_at"=>1398180286,
"deleted_at"=>nil},
>   {"guid"=>"b4d1d99f-4351-d0e7-221c-a3fae08716bc", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119",
"name"=>"My Awesome Bank", "created_at"=>1390617026, "updated_at"=>1398180288, "deleted_at"=>nil},
>   {"guid"=>"084638fa-a78d-bbdd-e075-7c9c957a9b46", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119",
"name"=>"My Awesome Bank", "created_at"=>1390617138, "updated_at"=>1398180288, "deleted_at"=>nil},
>   {"guid"=>"79287c76-4e8f-0a21-7569-a2bcdc2b2f4d", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119",
"name"=>"My Awesome Bank", "created_at"=>1390617135, "updated_at"=>1398180288, "deleted_at"=>nil},
>   {"guid"=>"3bcc26b2-7d3b-6c4d-cb27-4eb1574b3c20", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119",
"name"=>"Cayman Islands Bank", "created_at"=>1386902345, "updated_at"=>1398180288,
"deleted_at"=>nil},
>   {"guid"=>"75e1e56c-7611-4030-d002-afa2af70e5a1", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119",
"name"=>"My Awesome Bank", "created_at"=>1390617427, "updated_at"=>1398180288, "deleted_at"=>nil},
> ]
>  
> member_schema = <<-SCHEMA
> {"namespace": "md.data_logs",
>  "type": "record",
>  "name": "Member",
>  "fields": [
>      {"name": "guid", "type": "string"},
>      {"name": "user_guid", "type": "string"},
>      {"name": "name", "type": ["string","null"]},
>      {"name": "created_at", "type":"long"},
>      {"name": "updated_at", "type":"long"},
>      {"name": "deleted_at", "type":["long","null"]}
>  ]
> }
> SCHEMA
> filepath = "./members.avro"
> File.unlink(filepath) if File.exists?(filepath)
>  
> Avro::DataFile.open(filepath, "w", member_schema) do |dw|
>   data.each do |entry|
>     dw << entry
>   end
> end
>  
>  
> entries = []
> Avro::DataFile.open(filepath, "r") do |reader|
>   reader.each do |entry|
>     entries << entry
>   end
> end
>  
> puts "Here is the data I wrote into the file:"
> data.each{|e| p e }
> print "\n\n\n\n"
>  
> puts "Here is the data I read from the file:"
> entries.each{|e| p e }
> ```
> Under ruby 2+ it fails with the message "undefined method 'unpack' for nil:NilClass (NoMethodError)".
I have also tested that the rubygem can correctly read avro files written by the java client,
but the java client fails to read files written by the ruby client, so the issue is definitely
in how the rubygem is trying to write the binary file.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message