crunch-dev mailing list archives

From "Joseph Adler (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-36) Avro Compression settings are ignored by Crunch
Date Wed, 08 Aug 2012 03:45:10 GMT


Joseph Adler commented on CRUNCH-36:

Take a look at

It looks like the compression code is part of the output format, and the Hadoop settings are
ignored. I'm not sure I understand why it works this way, but it's the only way I could
get compressed output.

> Avro Compression settings are ignored by Crunch
> -----------------------------------------------
>                 Key: CRUNCH-36
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>          Components: IO
>    Affects Versions: 0.3.0
>            Reporter: Joseph Adler
>             Fix For: 0.3.0
>         Attachments: CRUNCH-36.patch
> Java Avro MapReduce jobs will compress the output if the configuration setting
> mapred.output.compress is true. Avro supports two codecs (deflate and snappy).
> Users typically select the codec by calling AvroJob.setOutputCodec and specifying
> either org.apache.avro.file.DataFileConstants.SNAPPY_CODEC or
> org.apache.avro.file.DataFileConstants.DEFLATE_CODEC. Users can specify the deflate
> level by setting the configuration setting avro.mapred.deflate.level.
> org.apache.crunch.types.avro.AvroOutputFormat currently ignores these settings.
> I'll follow up with a patch that fixes this problem.
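
For readers following the thread, here is a rough sketch of the lookup the description implies an output format would have to perform. It is not the attached patch and not Crunch's or Avro's actual code. A plain Map stands in for the Hadoop job configuration so the example runs on its own. The class and method names are hypothetical, and the key "avro.output.codec" and the default deflate level of -1 are assumptions about what AvroJob.setOutputCodec writes and what Avro falls back to.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a Map replaces Hadoop's JobConf so this runs without
// Hadoop or Avro on the classpath. AvroCodecSketch is a hypothetical name.
class AvroCodecSketch {
    // Keys named in the issue description.
    static final String COMPRESS_KEY = "mapred.output.compress";
    static final String DEFLATE_LEVEL_KEY = "avro.mapred.deflate.level";
    // Assumption: the key that AvroJob.setOutputCodec stores the codec under.
    static final String CODEC_KEY = "avro.output.codec";

    static final String NULL_CODEC = "null";       // Avro's name for "no compression"
    static final String DEFLATE_CODEC = "deflate";
    static final String SNAPPY_CODEC = "snappy";
    static final int DEFAULT_DEFLATE_LEVEL = -1;   // assumed default level

    // Returns a description of the codec the writer should be given.
    static String resolveCodec(Map<String, String> conf) {
        // Compression is off unless mapred.output.compress is true.
        if (!Boolean.parseBoolean(conf.getOrDefault(COMPRESS_KEY, "false"))) {
            return NULL_CODEC;
        }
        // Fall back to deflate when no codec was chosen explicitly.
        String codec = conf.getOrDefault(CODEC_KEY, DEFLATE_CODEC);
        if (codec.equals(DEFLATE_CODEC)) {
            // Only deflate takes a level; read it from the configuration.
            int level = Integer.parseInt(conf.getOrDefault(
                    DEFLATE_LEVEL_KEY, String.valueOf(DEFAULT_DEFLATE_LEVEL)));
            return DEFLATE_CODEC + "(level=" + level + ")";
        }
        return codec;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(COMPRESS_KEY, "true");
        conf.put(CODEC_KEY, SNAPPY_CODEC);
        System.out.println(resolveCodec(conf));
    }
}
```

An output format that skips this lookup would write uncompressed files no matter what the job configuration says, which matches the behavior reported here.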
