avro-user mailing list archives

From Saptarshi Guha <sg...@mozilla.com>
Subject Python Avro and Integers
Date Mon, 14 Mar 2011 02:35:29 GMT
Hello,

I have checked out Avro 1.6 and am prototyping some ideas with Python.
My (rather lengthy) avsc file can be found here [1].

The following Python code does not do what I expect:

import cStringIO as StringIO
from avro import schema, io

SCHEMA = schema.parse(open("/Users/sguha/tmp/ravro/robjects.avsc").read())
data = {'data': int(1)}
output = StringIO.StringIO()
rec_writer = io.DatumWriter(SCHEMA)
rec_writer.write(data, io.BinaryEncoder(output))
output.getvalue()


'\x04\x00\x00\x00\x00\x00\x00\xf0?\x00'


(1) I expected the first byte to be 0x04 (because int is the 2nd entry in the union, and 2
in variable-length zigzag encoding is 0x04).
  I then expected just a single byte, 0x02, corresponding to 1; instead I see 8 bytes. And if
it somehow upcasts to double,
why isn't the first byte the variable-length zigzag encoding of 3 (double in the ?

(2) Replacing int(1) with 1.0 returns the same bytes.

(3) Reading the data always returns a double

from avro import datafile

rec_writer = io.DatumWriter(SCHEMA)
df_writer = datafile.DataFileWriter(open("/tmp/foo", 'wb'), rec_writer,
                                    writers_schema=SCHEMA, codec='deflate')
df_writer.append(data)
df_writer.close()

rec_reader = io.DatumReader()
df_reader = datafile.DataFileReader(open("/tmp/foo", 'rb'), rec_reader).next()
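
As a sanity check on the byte values quoted in points (1) to (3), here is a minimal sketch of the variable-length zigzag encoding Avro uses for int and long, plus a decode of the 8 bytes that were actually written. It is written from the description of the encoding, not taken from the avro package; the helper name zigzag_varint is made up for illustration.

```python
import struct

def zigzag_varint(n):
    """Encode a signed integer the way Avro encodes int/long:
    zigzag first, then a base-128 variable-length integer."""
    # Zigzag maps 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... (64-bit form).
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)                      # final group, continuation bit clear
    return bytes(out)

union_index_2 = zigzag_varint(2)  # the first byte expected in point (1)
int_one = zigzag_varint(1)        # the single byte expected for the value 1

# The 8 bytes that follow the union index in the observed output,
# interpreted as a little-endian IEEE 754 double.
observed = b'\x00\x00\x00\x00\x00\x00\xf0?'
as_double = struct.unpack('<d', observed)[0]
```

Under these assumptions, union_index_2 is the single byte 0x04, int_one is 0x02, and as_double is 1.0, which matches the suspicion that the integer 1 was written as a double.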


(4) Arrays of integer 1's

import cStringIO as StringIO
SCHEMA = schema.parse(open("/Users/sguha/tmp/ravro/robjects.avsc").read())
data={'data':[int(1),int(1),int(1)]}
output = StringIO.StringIO()
rec_writer = io.DatumWriter(SCHEMA)
rec_writer.write(data,io.BinaryEncoder(output))
output.getvalue()

'\x06\x06\x02\x02\x02\x02\x02\x02\x00\x00'


0x06 is for the 4th entry (offset 3) in the union.
0x06 is for the length of the array.
This should be followed by three 0x02's, then a 0x0 (end of array), followed by a 0x0 (for the attributes).
But I see an extra byte for each of the 1's. In fact, adding an extra int(1) increases this
array by 2 bytes (it should be by 1 byte).
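
To make the byte counting above easier to check, here is a small sketch that reads the observed output as back-to-back zigzag varints. It assumes every byte belongs to an int/long and knows nothing about the schema, so it only shows the raw sequence of values; the helper name read_zigzag_varints is made up for illustration.

```python
def read_zigzag_varints(buf):
    """Decode a byte string as consecutive zigzag varints (Avro int/long)."""
    values, acc, shift = [], 0, 0
    for b in bytearray(buf):
        acc |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7                               # more groups follow
        else:
            values.append((acc >> 1) ^ -(acc & 1))   # undo zigzag
            acc, shift = 0, 0
    return values

observed = b'\x06\x06\x02\x02\x02\x02\x02\x02\x00\x00'
decoded = read_zigzag_varints(observed)
ones = decoded.count(1)
```

Read this way, decoded starts with 3 (union offset) and 3 (array length), followed by six 1's where three were expected, and ends with two 0's.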

I guess my Schema could be wrong ... 

Cheers
Saptarshi





[1] https://gist.github.com/868664
---
Saptarshi Guha | sguha@mozilla.com | skype: saptarshi.guha | irc: joy
If I love you, what business is it of yours? -- Johann van Goethe

