hadoop-common-issues mailing list archives

From "Peter Voss (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-7196) CompressionCodecFactory returns unconfigured GZipCodec if io.compression.codecs is not set
Date Thu, 17 Mar 2011 10:22:29 GMT
CompressionCodecFactory returns unconfigured GZipCodec if io.compression.codecs is not set
------------------------------------------------------------------------------------------

                 Key: HADOOP-7196
                 URL: https://issues.apache.org/jira/browse/HADOOP-7196
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 0.20.2
            Reporter: Peter Voss


If the {{io.compression.codecs}} property is not set, the GzipCodec is added using this code:
{code:java}
List<Class<? extends CompressionCodec>> codecClasses = getCodecClasses(conf);
if (codecClasses == null) {
  addCodec(new GzipCodec());
  addCodec(new DefaultCodec());      
} else {
  Iterator<Class<? extends CompressionCodec>> itr = codecClasses.iterator();
  while (itr.hasNext()) {
    CompressionCodec codec = ReflectionUtils.newInstance(itr.next(), conf);
    addCodec(codec);     
  }
}
{code}
This leaves GzipCodec unconfigured, because it is created with {{new}} and never receives the conf.
If the codec is instead listed in the {{io.compression.codecs}} property, it gets configured
properly through ReflectionUtils.newInstance(..., conf).
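
The difference between the two instantiation paths can be sketched in isolation. This is a minimal illustration, not the Hadoop source: {{SketchConf}}, {{SketchConfigurable}}, {{SketchGzipCodec}} and {{UnconfiguredCodecDemo}} are hypothetical stand-ins, the {{newInstance}} helper only imitates the idea of ReflectionUtils.newInstance(..., conf), and the buffer-size lookup is assumed purely to have something that dereferences the conf.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object (not Hadoop's Configuration).
class SketchConf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) { values.put(key, value); }

    int getInt(String key, int defaultValue) {
        String v = values.get(key);
        return v == null ? defaultValue : Integer.parseInt(v);
    }
}

// Hypothetical stand-in for a "can receive a configuration" interface.
interface SketchConfigurable {
    void setConf(SketchConf conf);
}

// Hypothetical codec whose work depends on the injected configuration.
class SketchGzipCodec implements SketchConfigurable {
    private SketchConf conf; // stays null unless setConf is called

    @Override
    public void setConf(SketchConf conf) { this.conf = conf; }

    int bufferSize() {
        // Dereferences conf, so an unconfigured codec throws NullPointerException here.
        return conf.getInt("io.file.buffer.size", 4096);
    }
}

class UnconfiguredCodecDemo {
    // Imitates the configured path: instantiate reflectively, then inject the conf.
    static <T> T newInstance(Class<T> cls, SketchConf conf) {
        try {
            T instance = cls.getDeclaredConstructor().newInstance();
            if (instance instanceof SketchConfigurable) {
                ((SketchConfigurable) instance).setConf(conf);
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    // Returns true if using the codec fails with a NullPointerException.
    static boolean failsWithNpe(SketchGzipCodec codec) {
        try {
            codec.bufferSize();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SketchConf conf = new SketchConf();
        // Path taken when the property is not set: plain constructor, no conf.
        System.out.println(failsWithNpe(new SketchGzipCodec()));                     // true
        // Path taken when the property is set: conf is injected after construction.
        System.out.println(failsWithNpe(newInstance(SketchGzipCodec.class, conf))); // false
    }
}
```

The only difference between the two calls in {{main}} is whether {{setConf}} ran before the codec was used, which is the same difference as between the two branches quoted above.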

I have seen many NPEs when using a LineRecordReader (which internally gets its codec from
CompressionCodecFactory) on systems that do not have this property set.

I suggest using {{org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec}}
as the default value for {{io.compression.codecs}}, instead of keeping a separate code
path that handles the case where this property is not set.
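
A rough sketch of what this suggestion could look like, assuming the property falls back to a default list of class names and every codec then goes through the same configure-on-instantiate path. All types here ({{FixConf}}, {{FixCodec}}, {{FixGzipCodec}}, {{FixDefaultCodec}}, {{SingleCodePathFactory}}) are hypothetical stand-ins written for this illustration. It is not a patch against CompressionCodecFactory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a configuration object.
class FixConf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) { values.put(key, value); }

    String get(String key, String defaultValue) {
        String v = values.get(key);
        return v == null ? defaultValue : v;
    }
}

// Hypothetical codec contract: a codec can be handed a configuration.
interface FixCodec {
    void setConf(FixConf conf);
    boolean isConfigured();
}

class FixDefaultCodec implements FixCodec {
    private FixConf conf;

    @Override
    public void setConf(FixConf conf) { this.conf = conf; }

    @Override
    public boolean isConfigured() { return conf != null; }
}

class FixGzipCodec extends FixDefaultCodec {
}

class SingleCodePathFactory {
    static final String KEY = "io.compression.codecs";

    // Default value used when the property is absent, so no special-case branch is needed.
    static final String DEFAULT_CODECS =
        FixGzipCodec.class.getName() + "," + FixDefaultCodec.class.getName();

    static List<FixCodec> loadCodecs(FixConf conf) {
        List<FixCodec> codecs = new ArrayList<>();
        for (String name : conf.get(KEY, DEFAULT_CODECS).split(",")) {
            try {
                // Same path for default and user-supplied codecs: instantiate, then configure.
                FixCodec codec = (FixCodec) Class.forName(name.trim())
                    .getDeclaredConstructor().newInstance();
                codec.setConf(conf);
                codecs.add(codec);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot load compression codec " + name, e);
            }
        }
        return codecs;
    }

    public static void main(String[] args) {
        // Property not set: the defaults are loaded and each one is configured.
        for (FixCodec codec : loadCodecs(new FixConf())) {
            System.out.println(codec.getClass().getName() + " configured=" + codec.isConfigured());
        }
    }
}
```

With a default value in place, an unset property and an explicitly set one differ only in the string being parsed, so there is no second branch that could skip the configuration step.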

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
