hadoop-common-issues mailing list archives

From "wangchao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12619) Native memory leaks in CompressorStream
Date Mon, 07 Dec 2015 15:19:11 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045070#comment-15045070 ]

wangchao commented on HADOOP-12619:
-----------------------------------

Hadoop 2.7.1 changes the implementation of GzipCodec.createOutputStream as follows:

{code}
  // In GzipCodec:
  @Override
  public CompressionOutputStream createOutputStream(OutputStream out) 
    throws IOException {
    if (!ZlibFactory.isNativeZlibLoaded(conf)) {
      return new GzipOutputStream(out);
    }
    return CompressionCodec.Util.
        createOutputStreamWithCodecPool(this, conf, out);
  }

  @Override
  public CompressionOutputStream createOutputStream(OutputStream out, 
                                                    Compressor compressor) 
  throws IOException {
    return (compressor != null) ?
               new CompressorStream(out, compressor,
                                    conf.getInt("io.file.buffer.size", 
                                                4*1024)) :
               createOutputStream(out);
  }

    // In CompressionCodec.Util:
    static CompressionOutputStream createOutputStreamWithCodecPool(
        CompressionCodec codec, Configuration conf, OutputStream out)
        throws IOException {
      Compressor compressor = CodecPool.getCompressor(codec, conf);
      CompressionOutputStream stream = null;
      try {
        stream = codec.createOutputStream(out, compressor);
      } finally {
        if (stream == null) {
          CodecPool.returnCompressor(compressor);
        } else {
          stream.setTrackedCompressor(compressor);
        }
      }
      return stream;
    }
 
{code}

but CompressorStream overrides the close method and still does not return the compressor to the pool.
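
For illustration only, here is a minimal sketch of the kind of change that could plug the leak (assuming CompressorStream can reach the compressor registered via setTrackedCompressor, referred to as trackedCompressor below; this is a sketch, not the actual patch):

{code}
  @Override
  public void close() throws IOException {
    if (!closed) {
      try {
        finish();              // drain any remaining data into the underlying stream
        out.close();
      } finally {
        closed = true;
        // Sketch: hand the pooled compressor back so its native zlib
        // state is reused (or ended) instead of leaking.
        if (trackedCompressor != null) {
          CodecPool.returnCompressor(trackedCompressor);
          trackedCompressor = null;
        }
      }
    }
  }
{code}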



> Native memory leaks in CompressorStream
> ---------------------------------------
>
>                 Key: HADOOP-12619
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12619
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: wangchao
>
> The constructor of org.apache.hadoop.io.compress.CompressorStream requires an org.apache.hadoop.io.compress.Compressor object to compress bytes, but it never ends the compressor or returns it to the pool when the close method is called. This may cause a native memory leak if the compressor is only used by this CompressorStream object.
> I found this when setting up a Flume agent with gzip compression; the native memory grows slowly and never falls back.
> {code}
>   @Override
>   public CompressionOutputStream createOutputStream(OutputStream out) 
>     throws IOException {
>     return (ZlibFactory.isNativeZlibLoaded(conf)) ?
>                new CompressorStream(out, createCompressor(),
>                                     conf.getInt("io.file.buffer.size", 4*1024)) :
>                new GzipOutputStream(out);
>   }
>   @Override
>   public Compressor createCompressor() {
>     return (ZlibFactory.isNativeZlibLoaded(conf))
>       ? new GzipZlibCompressor(conf)
>       : null;
>   }
> {code}
> The relevant methods of CompressorStream are
> {code}
>   @Override
>   public void close() throws IOException {
>     if (!closed) {
>       finish();
>       out.close();
>       closed = true;
>     }
>   }
>   @Override
>   public void finish() throws IOException {
>     if (!compressor.finished()) {
>       compressor.finish();
>       while (!compressor.finished()) {
>         compress();
>       }
>     }
>   }
> {code}
> No one ever ends the compressor.
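
(Not part of the report above, just a hypothetical reproduction sketch; the class name and loop are illustrative. Repeatedly creating and closing gzip output streams, the way a rolling sink such as Flume does, leaves one un-ended Compressor per iteration, so resident native memory grows even though the Java heap stays flat.)

{code}
import java.io.ByteArrayOutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class GzipLeakRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    GzipCodec codec = ReflectionUtils.newInstance(GzipCodec.class, conf);
    byte[] payload = new byte[64 * 1024];
    for (int i = 0; i < 1000000; i++) {
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      CompressionOutputStream out = codec.createOutputStream(sink);
      out.write(payload);
      out.close();  // finish()es the compressor but never end()s or returns it
    }
  }
}
{code}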



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
