avro-user mailing list archives

From "Enns, Steven" <sae...@a9.com>
Subject Re: map/reduce of compressed Avro
Date Mon, 29 Apr 2013 22:53:49 GMT
Out of curiosity, are there any other file formats that provide splittable
gzip compression like Avro object containers?  I can only think of
Sequence Files.

On 4/29/13 3:47 PM, "Scott Carey" <scottcarey@apache.org> wrote:

>Martin said it already, but I will emphasize:
>Avro data files are splittable and can support multiple mappers no matter
>what codec is used for compression.  This is because Avro files are block
>based and apply compression only within each block.  I recommend
>starting with the deflate codec (gzip-style compression), and moving to
>snappy only if deflate compression level '1' is not fast enough.
>For more information on avro data files, see:
>
>On 4/22/13 11:47 PM, "nir_zamir" <nir.zamir@gmail.com> wrote:
>>Thanks Martin.
>>What will happen if I try to use an indexed LZO-compressed Avro file? Will
>>it work and utilize the index to allow multiple mappers?
>>I think that for Snappy, for example, the file is splittable and can use
>>multiple mappers, but I haven't tested it yet - would be glad if anyone
>>has any experience with that.
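
To make the "block based" point in the quoted reply concrete, here is a
minimal sketch of why per-block compression keeps a file splittable. It is
NOT the real Avro container format or the Avro API: the layout (a 16-byte
marker, a record count, a payload length, then a zlib-compressed payload)
and the names write_blocks and read_split are assumptions made up for this
illustration. The idea it borrows is only that each block is compressed on
its own and delimited by a per-file sync marker, so a reader handed an
arbitrary byte range can scan forward to the next marker and start
decompressing there.

```python
import os
import struct
import zlib

SYNC_SIZE = 16  # length of the toy per-file sync marker


def write_blocks(records, records_per_block=100, level=1):
    """Return (file_bytes, sync_marker) for a toy block-compressed file.

    Hypothetical layout, repeated per block:
    [sync marker][record count][payload length][zlib payload].
    Records are newline-joined, so they must not contain newlines.
    """
    sync = os.urandom(SYNC_SIZE)  # random marker, assumed not to occur in payloads
    out = bytearray()
    for i in range(0, len(records), records_per_block):
        chunk = records[i:i + records_per_block]
        # Each block is compressed independently of every other block.
        payload = zlib.compress("\n".join(chunk).encode("utf-8"), level)
        out += sync
        out += struct.pack(">II", len(chunk), len(payload))
        out += payload
    return bytes(out), sync


def read_split(data, sync, start, end):
    """Return records of every block whose marker begins in [start, end)."""
    records = []
    pos = data.find(sync, start)  # skip ahead to the first marker in this split
    while pos != -1 and pos < end:
        header = pos + SYNC_SIZE
        count, size = struct.unpack_from(">II", data, header)
        body = header + 8
        block = zlib.decompress(data[body:body + size]).decode("utf-8").split("\n")
        assert len(block) == count
        records.extend(block)
        pos = data.find(sync, body + size)  # the next block's marker, if any
    return records


# Two "mappers" read disjoint byte ranges of the same compressed file.
records = [f"record-{n}" for n in range(1000)]
data, sync = write_blocks(records)
mid = len(data) // 2
first = read_split(data, sync, 0, mid)
second = read_split(data, sync, mid, len(data))
assert first + second == records
print(len(first), len(second))
```

Because a block belongs to whichever split contains the start of its
marker, every block is read exactly once no matter where the split
boundaries fall. A single gzip stream over the whole file offers no such
restart points, which is why the codec choice does not affect
splittability in a block-based layout.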
