avro-user mailing list archives

From Douglas Creager <dcrea...@dcreager.net>
Subject Re: C implementation - wrong number of records
Date Thu, 12 May 2011 14:26:21 GMT
> I have the following C code - https://gist.github.com/967968
> When I ran it on a 100000 records file, it says 100030. (Both C and
> Python implementation count 10000).
> 
> What am I doing wrong?

You found a bug in the C library's file reader code; I've opened up a bug report for it:

https://issues.apache.org/jira/browse/AVRO-819

The problem is that the file reader code isn't propagating errors correctly up through the
call stack, so avro_file_reader_read doesn't detect EOF, and you end up looping through
the final block of the file twice.  That's where the extra 30 records in your count come
from: in the file you're reading, the final block must contain 30 records.
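
To make the failure mode concrete, here is a toy model of it. This is not the Avro
source: toy_reader, toy_refill, toy_read and count_records are hypothetical names, the
block sizes are made up (only the final 30 comes from your numbers), and the "stash the
error and notice it one block later" mechanism is an assumption chosen to reproduce a
single replay of the last block.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a block-structured file reader. NOT the Avro source. */
typedef struct {
    const long *block_sizes; /* record count of each block (made-up data) */
    size_t num_blocks;
    size_t next_block;       /* index of the next block to load */
    long block_total;        /* records in the current block */
    long cursor;             /* records already consumed from that block */
    int pending_error;       /* refill error that was swallowed earlier */
} toy_reader;

/* Load the next block header. Returns 0 on success, -1 at end of input.
 * On failure block_total is left stale, which is what allows a replay. */
int toy_refill(toy_reader *r)
{
    r->cursor = 0;
    if (r->next_block >= r->num_blocks) {
        return -1;
    }
    r->block_total = r->block_sizes[r->next_block++];
    return 0;
}

/* Read one record. With propagate != 0 a refill failure is returned to the
 * caller right away; with propagate == 0 it is swallowed and only noticed
 * at the next block boundary (the assumed bug). */
int toy_read(toy_reader *r, int propagate)
{
    if (r->cursor >= r->block_total) {
        if (r->pending_error != 0) {
            return -1;               /* error finally surfaces, one block late */
        }
        int rc = toy_refill(r);
        if (rc != 0) {
            if (propagate) {
                return rc;           /* correct: caller sees EOF immediately */
            }
            r->pending_error = rc;   /* buggy: keep going with stale state */
        }
    }
    r->cursor++;
    return 0;
}

/* Count records the way a caller's read loop would. */
long count_records(const long *sizes, size_t n, int propagate)
{
    assert(sizes != NULL);
    toy_reader r = { sizes, n, 0, 0, 0, 0 };
    long count = 0;
    while (toy_read(&r, propagate) == 0) {
        count++;
    }
    return count;
}
```

With block sizes of 49985, 49985 and 30, count_records returns 100000 when the error is
propagated and 100030 when it is swallowed, which matches the shape of the miscount you
saw.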

I've got a patch ready for this; I'll test on a couple of platforms and then commit it to
Subversion.

cheers
–doug