avro-user mailing list archives

From felix gao <gre1...@gmail.com>
Subject Avro speed comparison with raw logs
Date Wed, 02 Mar 2011 05:05:37 GMT
Hello group,

I am running comparison tests on a data set that I converted to Avro
with the deflate codec set to level 6. The original data consists of 2880
uncompressed HTTP access logs with a total size of 1.4 TB. The compressed
Avro output is about 2/3 of that size. However, when I ran a Pig job on
the raw logs, the initial map phase was very fast and finished in
under 40 minutes. When I ran the same Pig job on the Avro files, the initial
map phase took 8 minutes to reach only 10%. Is there any way to
figure out what is slowing down the map phase?
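
One rough way to check whether decompression alone could explain the slowdown is to time raw deflate at level 6 outside of Pig and Hadoop. The sketch below is only an illustration using Python's standard `zlib` module, not the job from this message: the function name `time_deflate_roundtrip` and the sample log line are hypothetical, and it measures decompression only, not Avro record decoding or Pig's loader.

```python
import time
import zlib


def time_deflate_roundtrip(payload: bytes, level: int = 6) -> dict:
    """Compress payload with raw deflate, then time a single decompression."""
    # wbits=-15 selects raw deflate (RFC 1951) with no zlib header or checksum,
    # which is the framing Avro's "deflate" codec uses for data blocks.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    compressed = compressor.compress(payload) + compressor.flush()

    start = time.perf_counter()
    restored = zlib.decompress(compressed, -15)
    elapsed = time.perf_counter() - start

    return {
        "raw_bytes": len(payload),
        "compressed_bytes": len(compressed),
        "ratio": len(compressed) / len(payload),
        "decompress_seconds": elapsed,
        "roundtrip_ok": restored == payload,
    }


# Hypothetical access-log line, repeated to build a payload of a few megabytes.
sample_line = (
    b'127.0.0.1 - - [02/Mar/2011:05:05:37 +0000] '
    b'"GET /index.html HTTP/1.1" 200 1024\n'
)
stats = time_deflate_roundtrip(sample_line * 100_000)
print(stats)
```

A repeated line compresses far better than real logs, so the ratio printed here is not representative. Feeding in a slice of an actual log file would give a more realistic per-megabyte decompression time to compare against the map task's throughput.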


