avro-user mailing list archives

From Sean Busbey <bus...@cloudera.com>
Subject Re: Python API - DataFileReader cannot read .avro file created from DataFileWriter
Date Tue, 13 Oct 2015 13:57:16 GMT
Can the avro-tools jar read the schema from the datafile? Can it read
the entries from the datafile using tojson?
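
For reference, a minimal sketch of running both checks from Python. The jar
file name and its location in the working directory are assumptions, and the
helper names are hypothetical; getschema and tojson are the avro-tools
subcommands meant above.

```python
import subprocess

# Assumption: the avro-tools jar matching the Avro release is in the
# working directory under this name. Adjust the path as needed.
AVRO_TOOLS_JAR = "avro-tools-1.7.7.jar"


def avro_tools_command(subcommand, datafile, jar=AVRO_TOOLS_JAR):
    """Build the argument list for one avro-tools invocation."""
    return ["java", "-jar", jar, subcommand, datafile]


def run_avro_tools(subcommand, datafile, jar=AVRO_TOOLS_JAR):
    """Run avro-tools and return (exit code, stdout, stderr)."""
    proc = subprocess.Popen(
        avro_tools_command(subcommand, datafile, jar),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = proc.communicate()
    return proc.returncode, out, err
```

Calling run_avro_tools("getschema", "mygram.avro") and then
run_avro_tools("tojson", "mygram.avro") would show whether the Java tooling
can read the file header and the records. If getschema works but tojson
fails, that would suggest the header is intact and the data blocks are not.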

On Mon, Oct 5, 2015 at 6:09 AM, Balaji Vijayan
<balaji.k.vijayan@gmail.com> wrote:
> Windows 8.1, Python 2.7, Avro 1.7.7
> Using this avro schema and data in this format I am able to validate the
> data against the schema prior to attempting to write the data to a .avro
> file using the python DataFileWriter. The data writes successfully to a
> .avro file. When I attempt to read the data, I receive either a list index
> out of range error or a SchemaResolutionException: Can't access branch index
> XX for union with 12 branches.
> Code:
> # imports
> import avro.schema
> from avro.datafile import DataFileReader, DataFileWriter
> from avro.io import DatumReader, DatumWriter
> from pprint import pprint
> # get data
> data = [list of dictionaries in 2nd link]
> # get schema and writer
> schema = avro.schema.parse(open("mygramschema.avsc").read())
> writer = DataFileWriter(open("mygram.avro", "w"), DatumWriter(), schema)
> # write data
> for vals in data:
>     writer.append(vals)
> writer.close()
> # get reader
> reader = DataFileReader(open("mygram.avro", "r"), DatumReader())
> for data in reader:
>     pprint(data)
> When I receive the list index out of range error nothing prints, and when I
> receive the SchemaResolutionException some of the data prints but not all. I
> am generating this data on the fly, so I've checked for a number of different
> unicode encoding issues and had no luck, so I don't think that's the issue.
> I'm at a loss for how to go about troubleshooting this since the avro schema
> checks out when I use avro.io.validate; in addition, the jsontofrag jar
> utility has provided no additional information for debugging.
> Thanks,
> Balaji
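
One possibility that nothing in this thread confirms: the code above opens
both files without "b" in the mode string. On Windows, Python 2 text mode
writes each 0x0A byte as 0x0D 0x0A and treats 0x1A as end-of-file on read,
and either would damage a binary container file. The sketch below uses only
the standard library and emulates the write-side translation on any
platform. The payload is a made-up stand-in for Avro-encoded bytes, and all
function names are hypothetical.

```python
import io
import os
import tempfile

# Stand-in for Avro-encoded bytes: any binary payload may contain 0x0A.
PAYLOAD = b"\x02\x0a\x14\x0a\x06"


def write_text_mode_like_windows(path, payload):
    """Emulate Windows text mode anywhere: each LF is written as CR LF."""
    with io.open(path, "w", encoding="latin-1", newline="\r\n") as handle:
        handle.write(payload.decode("latin-1"))


def write_binary_mode(path, payload):
    """Write the bytes untouched, as mode "wb" does on every platform."""
    with io.open(path, "wb") as handle:
        handle.write(payload)


def read_bytes(path):
    """Read the file back exactly as stored on disk."""
    with io.open(path, "rb") as handle:
        return handle.read()


def demo():
    """Return the bytes found on disk after each kind of write."""
    workdir = tempfile.mkdtemp()
    text_path = os.path.join(workdir, "text_mode.bin")
    binary_path = os.path.join(workdir, "binary_mode.bin")
    write_text_mode_like_windows(text_path, PAYLOAD)
    write_binary_mode(binary_path, PAYLOAD)
    return read_bytes(text_path), read_bytes(binary_path)
```

The text-mode copy comes back two bytes longer than the payload, which would
shift every later offset a decoder relies on. If this is the cause here,
passing "wb" to the open() call for DataFileWriter and "rb" to the one for
DataFileReader would be the change to try.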

