avro-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Avro Python package slowness
Date Fri, 06 May 2011 17:58:10 GMT
On 05/06/2011 10:34 AM, Miki Tebeka wrote:
> I'm using the Avro Python package (1.5.0), and it is slow.
> It takes about 1 min to process a file of 33K records. For comparison, the
> Java package processes the same file in 1 sec.
> 
> Any ideas on how to speed that up?

Does the schema have unions?  Last I checked, the Python implementation
recursively validates data in order to determine which branch of a union
should be written.  In the worst case (nested unions) this can lead to
quadratic serialization times.  It should be possible to determine the
union branch to write much more efficiently.
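
To make the cost concrete, here is a rough sketch of the pattern in a toy
model.  It is not the avro package's code: the schema representation and
the names validate, shallow_match, write, chain and count_calls are all
made up for illustration.  It counts how many branch checks a writer
performs on a chain of nested unions, first with a deep recursive check
and then with a check that only looks at the datum's top-level type.  The
shallow check assumes the branches of each union can be told apart by
top-level type alone, which is not true of every schema.

```python
# Toy model only; this is NOT the avro package's code.
# Schemas: "null" | "int" | "string" | [branch, ...] (a union)
#          | {"type": "record", "fields": [(name, schema), ...]}

calls = 0  # number of branch checks performed so far


def _primitive_ok(schema, datum):
    if schema == "null":
        return datum is None
    if schema == "int":
        return isinstance(datum, int) and not isinstance(datum, bool)
    if schema == "string":
        return isinstance(datum, str)
    raise ValueError("unknown schema: %r" % (schema,))


def validate(schema, datum):
    """Deep check: walks the whole datum, including nested unions."""
    global calls
    calls += 1
    if isinstance(schema, list):  # union
        return any(validate(branch, datum) for branch in schema)
    if isinstance(schema, dict):  # record
        return isinstance(datum, dict) and all(
            validate(s, datum.get(name)) for name, s in schema["fields"])
    return _primitive_ok(schema, datum)


def shallow_match(schema, datum):
    """Hypothetical cheaper check: looks only at the top-level type."""
    global calls
    calls += 1
    if isinstance(schema, list):
        return any(shallow_match(branch, datum) for branch in schema)
    if isinstance(schema, dict):
        return isinstance(datum, dict)
    return _primitive_ok(schema, datum)


def write(schema, datum, out, match=validate):
    """Append a flat encoding of datum to out, resolving unions via match."""
    if isinstance(schema, list):
        for index, branch in enumerate(schema):
            if match(branch, datum):
                out.append(index)  # record which branch was chosen
                write(branch, datum, out, match)
                return
        raise TypeError("datum matches no union branch")
    if isinstance(schema, dict):
        for name, s in schema["fields"]:
            write(s, datum.get(name), out, match)
    else:
        out.append(datum)


def chain(depth):
    """Build a union-of-record chain `depth` levels deep, plus a matching datum."""
    schema, datum = "int", 7
    for _ in range(depth):
        schema = ["null", {"type": "record", "fields": [("next", schema)]}]
        datum = {"next": datum}
    return schema, datum


def count_calls(depth, match):
    global calls
    calls = 0
    schema, datum = chain(depth)
    write(schema, datum, [], match)
    return calls


if __name__ == "__main__":
    for depth in (10, 20, 40):
        print(depth, count_calls(depth, validate), count_calls(depth, shallow_match))
```

With the deep check, every union level re-validates everything beneath it,
so the number of checks grows roughly with the square of the nesting
depth.  With the shallow check it grows linearly.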

It would be great to have some performance benchmarks for Python, as we
do for Java.

Doug
