avro-user mailing list archives

From Scott Carey <scottca...@apache.org>
Subject Re: avro compression using snappy and deflate
Date Mon, 02 Apr 2012 15:31:01 GMT

On 3/30/12 12:08 PM, "Shirahatti, Nikhil" <snikhil@telenav.com> wrote:

>Hello All,
>I think I figured out where I goofed up.
>I was flushing on every record, so basically this was compression per
>record, so it had metadata with each record. This was adding more data
>to the output when compared to avro.
>So now I have better figures: at least they look realistic, still need to
>find out if it is map-reduceable.
>Avro= 12G
>Avro+Deflate= 4.5G
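
The per-record flushing effect described above can be sketched roughly as
follows. This uses Python's standard zlib module rather than Avro's own
writer, and the sample records, batch size, and function names are
assumptions made up for illustration. Compressing every record on its own
pays the fixed per-stream overhead each time and gives the compressor no
history to exploit, while compressing records in batches does not.

```python
import zlib

# Hypothetical sample records; real Avro records would be binary-encoded.
records = [f"user{i % 50},2012-03-30,route,{i}\n".encode() for i in range(2000)]


def compress_per_record(records, level=6):
    """Compress each record separately, like flushing after every append."""
    return sum(len(zlib.compress(record, level)) for record in records)


def compress_batched(records, batch_size=500, level=6):
    """Compress records in batches, like letting the writer fill a block."""
    total = 0
    for start in range(0, len(records), batch_size):
        block = b"".join(records[start:start + batch_size])
        total += len(zlib.compress(block, level))
    return total


raw_size = sum(len(record) for record in records)
per_record_size = compress_per_record(records)
batched_size = compress_batched(records)

print("raw bytes:        ", raw_size)
print("per-record bytes: ", per_record_size)
print("batched bytes:    ", batched_size)
```

The exact numbers depend on the data, but the batched total should come out
well below the per-record total, which matches the symptom reported above.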

The Deflate compression level (1 to 9) strongly affects both speed and
compression ratio.  In my experience, anything past level 6 is only very
slightly smaller and much slower, while the difference between levels 1
and 3 is large on both fronts.
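
One way to check this on your own data is a quick measurement like the
sketch below. It uses Python's standard zlib module (a Deflate
implementation) instead of Avro itself, and the sample data and the
measure() helper are assumptions for illustration only; real records will
give different sizes and timings.

```python
import time
import zlib

# Hypothetical, repetitive sample data standing in for a real block of records.
data = b"".join(
    f"user{i % 97},lat={i % 180},lon={i % 360},speed={i % 70}\n".encode()
    for i in range(50000)
)


def measure(level):
    """Return (compressed size in bytes, seconds taken) for one Deflate level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


results = {level: measure(level) for level in (1, 3, 6, 9)}

print(f"uncompressed: {len(data)} bytes")
for level, (size, seconds) in results.items():
    print(f"level {level}: {size} bytes in {seconds:.4f}s")
```

In the Avro Java API the level is chosen when the codec is created, for
example with CodecFactory.deflateCodec(level) passed to
DataFileWriter.setCodec().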

>Avro+Snappy = 5.5G
>Have others tried Avro + LZO?

I have not heard of anyone doing this.  LZO is GPL-licensed, so it is not
Apache license compatible, and several alternatives in the same class of
compression algorithm are now available, including Snappy.

>On 3/30/12 12:54 AM, "Shirahatti, Nikhil" <snikhil@telenav.com> wrote:
>>The original data file (a text file) is 40GB, the avro file is about
>>avro snappy is 13GB!
