avro-user mailing list archives

From Gaurav Nanda <gaurav...@gmail.com>
Subject Re: Writing in Cpp and reading in Python
Date Tue, 15 May 2012 18:34:09 GMT
Also I was wrong in stating that

"Further I try to decode that string in C++, it works fine, but fails in python"

Even in a sample C++ program, decoding the string returned from the
"encode" function fails much of the time.
So the real question is: what is the right way to read the encoded
data back out and return it?

Thanks,
Gaurav Nanda
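
One possible way to hand binary Avro data between processes is to treat it as a length-prefixed byte buffer rather than as a text string, so that embedded zero bytes are never mistaken for a terminator. The sketch below is Python and uses only the standard library. The 4-byte big-endian length prefix and the helper names frame and unframe are assumptions made for illustration; they are not part of Avro or of the code in this thread.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def unframe(buf: bytes) -> bytes:
    """Return the payload, or raise ValueError if the buffer is too short."""
    if len(buf) < 4:
        raise ValueError("missing length prefix")
    (length,) = struct.unpack(">I", buf[:4])
    payload = buf[4:4 + length]
    if len(payload) != length:
        raise ValueError("expected %d bytes, got %d" % (length, len(payload)))
    return payload


# The nine-byte message discussed in this thread: one byte, then eight zeros.
message = b"\x08" + b"\x00" * 8
wire = frame(message)

# The zero bytes survive the round trip because the length is explicit.
assert unframe(wire) == message
```

With an explicit length, a receiver that gets only part of the message fails with a clear error instead of a confusing end-of-file deep inside the decoder. If the sender is C++, the matching precaution would be to build the std::string from a pointer plus a byte count rather than from a NUL-terminated char*.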

On Tue, May 15, 2012 at 11:58 PM, Gaurav Nanda <gaurav324@gmail.com> wrote:
> Python has indeed reached end of file: it was expecting
> "\x08\x00\x00\x00\x00\x00\x00\x00\x00" but received only "\x08", so it
> complains.
>
> I suspect the Cpp snippet I shared is buggy somewhere. I think
> I am not extracting the entire encoded string.
> What is the suggested way to pass encoded data from Cpp over the wire?
>
> On Tue, May 15, 2012 at 11:37 PM, Miki Tebeka <miki.tebeka@gmail.com> wrote:
>> Can you place an example avro file somewhere so I can take a look?
>>
>> From the error, it looks like Python has reached end of file and
>> read(1) returned an empty string.
>>
>>
>> On Mon, May 14, 2012 at 2:11 PM, Gaurav Nanda <gaurav324@gmail.com> wrote:
>>> Hi,
>>>
>>> I am using the following schema to write in C++ and read in Python.
>>>
>>> {
>>>   "type": "record",
>>>   "name": "jok_obj",
>>>   "fields": [
>>>     {"name": "val",
>>>      "type": ["null", "boolean", "long", "int", "double", "float", "string",
>>>               {"name": "date", "type": "record",
>>>                "fields": [{"name": "value", "type": "int"}]},
>>>               {"name": "datetime", "type": "record",
>>>                "fields": [{"name": "date", "type": "int"},
>>>                           {"name": "tics", "type": "int"}]},
>>>               {"name": "timestamp", "type": "record",
>>>                "fields": [{"name": "sec", "type": "long"},
>>>                           {"name": "microsec", "type": "long"}]},
>>>               {"type": "map", "values": "jok_obj"},
>>>               {"type": "array", "items": "jok_obj"}]}
>>>   ]
>>> }
>>>
>>> I encode a C++ object to a memoryOutputStream, read it back through a
>>> memoryInputStream using StreamReader, and ultimately convert it to a
>>> std::string. Further I try
>>> to decode that string in C++, it works fine, but fails in python.
>>> =====
>>>    std::string AvroObj::encode()
>>>    {
>>>        std::auto_ptr<avro::OutputStream> out = avro::memoryOutputStream();
>>>        avro::EncoderPtr e = avro::binaryEncoder();
>>>        e->init(*out);
>>>        avro::encode(*e, obj);
>>>
>>>        std::auto_ptr<avro::InputStream> in = avro::memoryInputStream(*out);
>>>        avro::StreamReader* reader = new avro::StreamReader(*in);
>>>
>>>        std::stringstream ss;
>>>        while(reader->hasMore()) {
>>>            ss << reader->read();
>>>        }
>>>
>>>        return ss.str();
>>>    }
>>>
>>> =====
>>>
>>> I am trying to encode {"val" : 0.0}, which encodes to
>>> "\x08". But when I send this to Python, it fails with:
>>>
>>> ==============================================
>>> ...
>>> File "/u/nanda/jok/lib/python/*****/jok/rpc.py", line 451, in to_avro
>>>    record = dr.read(decoder)
>>>  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 445, in read
>>>    return self.read_data(self.writers_schema, self.readers_schema, decoder)
>>>  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 490, in read_data
>>>    return self.read_record(writers_schema, readers_schema, decoder)
>>>  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 690, in read_record
>>>    field_val = self.read_data(field.type, readers_field.type, decoder)
>>>  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 488, in read_data
>>>    return self.read_union(writers_schema, readers_schema, decoder)
>>>  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 654, in read_union
>>>    return self.read_data(selected_writers_schema, readers_schema, decoder)
>>>  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 458, in read_data
>>>    return self.read_data(writers_schema, s, decoder)
>>>  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 476, in read_data
>>>    return decoder.read_double()
>>>  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 218, in read_double
>>>    ((ord(self.read(1)) & 0xffL) << 48) |
>>> TypeError: ord() expected a character, but string of length 0 found
>>> ================================================
>>>
>>> While digging further, I found that Python encodes {"val" : 0.0} as
>>> "\x08\x00\x00\x00\x00\x00\x00\x00\x00". Any string shorter than
>>> this gives the above error.
>>>
>>> Could you please suggest?
>>>
>>> Thanks,
>>> Gaurav Nanda
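
For reference, the nine bytes Python produces follow from the Avro binary encoding: a union is written as the zero-based index of the chosen branch (a zig-zag varint long) followed by the value, and a double is eight little-endian IEEE 754 bytes. The sketch below reproduces that by hand with only the standard library. The helper names are made up for illustration, and it covers only the primitive branches of the schema above. The last lines show that cutting the buffer at the first zero byte leaves just "\x08"; that this is what happens on the C++ side is a guess, not something confirmed in this thread.

```python
import struct

# Primitive branches of the "val" union, in schema order.
UNION_BRANCHES = ["null", "boolean", "long", "int", "double", "float", "string"]


def zigzag_varint(n: int) -> bytes:
    """Encode a 64-bit signed integer as an Avro zig-zag varint."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_union_double(value: float) -> bytes:
    """Encode a double as a member of the union: branch index, then value."""
    index = UNION_BRANCHES.index("double")  # 4, which zig-zags to 0x08
    return zigzag_varint(index) + struct.pack("<d", value)


# A record is just its fields in order, so {"val": 0.0} is the union alone.
encoded = encode_union_double(0.0)
assert encoded == b"\x08\x00\x00\x00\x00\x00\x00\x00\x00"

# Treating the buffer as a zero-terminated string keeps only the first byte.
truncated = encoded.split(b"\x00", 1)[0]
assert truncated == b"\x08"
```

So "\x08" on its own is only the union branch index for double; the eight bytes of the value itself are missing, which matches the decoder running out of input in read_double.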
