avro-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Avro speed comparison with raw logs
Date Fri, 04 Mar 2011 17:25:41 GMT
On 03/01/2011 09:05 PM, felix gao wrote:
> I am running some comparison tests on a data set that I converted to
> Avro with deflate set to level 6. The original logs consist of 2880
> uncompressed HTTP access logs with a total size of 1.4TB. The compressed
> Avro log is about 2/3 of that size. However, when I ran the same Pig job
> on the raw logs, the initial map phase was blazing fast and
> finished in under 40 min. When I ran the same Pig job on the Avro files,
> the initial map phase took 8 minutes to finish only 10%. Is there any
> way to figure out what is slowing down the map?

What version of Avro are you using?  How are you integrating Avro with Pig?

Also, for speed, you might try level=1 (Deflater.BEST_SPEED).

Doug
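
The level=1 suggestion above can be tried in isolation before re-converting 1.4TB of logs. The following is a minimal sketch, not a benchmark of Avro itself: it uses only java.util.zip to compare deflate level 1 (Deflater.BEST_SPEED) with level 6 on synthetic data. The class name DeflateLevels, the helper methods, and the sample log format are all made up for illustration. In Avro's Java API the level is normally passed as CodecFactory.deflateCodec(level) to DataFileWriter.setCodec; how the files in this thread were actually written is not stated.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateLevels {

    // Compress input at the given deflate level (1 = BEST_SPEED, 9 = BEST_COMPRESSION).
    public static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflate a buffer produced by compress(); fails on truncated or corrupt input.
    public static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated deflate stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt deflate stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Hypothetical stand-in for HTTP access log lines; real logs will compress differently.
    public static byte[] sampleLog(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("10.0.0.").append(i % 256)
              .append(" - - \"GET /page/").append(i % 1000)
              .append(" HTTP/1.1\" 200 ").append(1000 + (i * 37) % 9000)
              .append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleLog(200_000);
        for (int level : new int[] {Deflater.BEST_SPEED, 6}) {
            long start = System.nanoTime();
            byte[] packed = compress(data, level);
            long writeMs = (System.nanoTime() - start) / 1_000_000;

            start = System.nanoTime();
            byte[] unpacked = decompress(packed);
            long readMs = (System.nanoTime() - start) / 1_000_000;

            System.out.printf("level=%d in=%d out=%d compress=%dms decompress=%dms%n",
                    level, unpacked.length, packed.length, writeMs, readMs);
        }
    }
}
```

The sketch times compression and decompression separately because a map phase mostly reads data. If the decompression times at the two levels turn out to be close on a sample of the real logs, the slowdown is more likely to come from elsewhere, such as the Avro/Pig integration asked about above.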
