hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10591) Compression codecs must used pooled direct buffers or deallocate direct buffers when stream is closed
Date Fri, 16 May 2014 11:22:32 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998537#comment-13998537 ]

Colin Patrick McCabe commented on HADOOP-10591:
-----------------------------------------------

Thanks, Gopal.  I agree that this is a pre-existing issue, definitely not introduced by
HADOOP-10047.  In fact, that JIRA should improve the situation in many cases by eliminating
the need for the {{Decompressor}} to allocate its own direct buffer.

Semi-related: one thing I noticed in the constructor for {{ZlibDirectDecompressor}} is that
it invokes the superclass constructor ({{ZlibDecompressor}}) with {{directBufferSize = 0}},
which causes us to call {{allocateDirect}} with a size of 0.  I wonder what this actually
does.  I couldn't find any documentation for this case (maybe I missed it?).
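
For reference, the zero-size case can be checked with a small standalone program. This is only a sketch against the plain JDK {{java.nio.ByteBuffer}} API, not Hadoop code; the class name {{ZeroSizeDirectBuffer}} and the helper {{allocate}} are hypothetical. The {{ByteBuffer.allocateDirect}} javadoc documents an {{IllegalArgumentException}} only for a negative capacity, so a capacity of 0 should simply yield an empty direct buffer. How much native memory the JVM reserves behind such a buffer is implementation-specific and is not shown here.

```java
import java.nio.ByteBuffer;

public class ZeroSizeDirectBuffer {
    // Hypothetical helper: allocates a direct buffer of the requested capacity.
    static ByteBuffer allocate(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }

    public static void main(String[] args) {
        // Mirrors the directBufferSize = 0 case discussed above.
        ByteBuffer buf = allocate(0);
        System.out.println("isDirect=" + buf.isDirect()
            + " capacity=" + buf.capacity()
            + " remaining=" + buf.remaining());
    }
}
```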

> Compression codecs must used pooled direct buffers or deallocate direct buffers when stream is closed
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-10591
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10591
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Hari Shreedharan
>            Assignee: Colin Patrick McCabe
>
> Currently direct buffers allocated by compression codecs like Gzip (which allocates 2
> direct buffers per instance) are not deallocated when the stream is closed. Eventually for
> long running processes which create a huge number of files, these direct buffers are left
> hanging till a full gc, which may or may not happen in a reasonable amount of time - especially
> if the process does not use a whole lot of heap.
> Either these buffers should be pooled or they should be deallocated when the stream is
> closed.
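
To illustrate the pooling option in the description, here is a minimal sketch of a fixed-size direct buffer pool. It is not the fix applied for this issue and does not use any Hadoop class; {{SimpleDirectBufferPool}}, {{take}} and {{giveBack}} are hypothetical names, and the 64 KB size is an arbitrary assumption. The idea is that a stream's {{close()}} would return its buffers to the pool instead of dropping the references and waiting for a full GC.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

public class SimpleDirectBufferPool {
    private final int bufferSize;
    // Buffers that have been returned and are available for reuse.
    private final ConcurrentLinkedQueue<ByteBuffer> free = new ConcurrentLinkedQueue<>();

    public SimpleDirectBufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    // Reuse a pooled buffer if one is available, otherwise allocate a new one.
    public ByteBuffer take() {
        ByteBuffer buf = free.poll();
        if (buf == null) {
            buf = ByteBuffer.allocateDirect(bufferSize);
        }
        buf.clear(); // reset position and limit before handing it out
        return buf;
    }

    // Return a buffer to the pool; buffers of the wrong kind or size are ignored.
    public void giveBack(ByteBuffer buf) {
        if (buf.isDirect() && buf.capacity() == bufferSize) {
            free.offer(buf);
        }
    }

    public static void main(String[] args) {
        SimpleDirectBufferPool pool = new SimpleDirectBufferPool(64 * 1024);
        ByteBuffer first = pool.take();
        pool.giveBack(first);            // e.g. called when the stream is closed
        ByteBuffer second = pool.take(); // the same buffer is handed out again
        System.out.println("reused=" + (first == second));
    }
}
```

A pool like this bounds the number of live direct buffers by peak concurrency rather than by the number of streams ever opened, at the cost of keeping idle buffers allocated.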



--
This message was sent by Atlassian JIRA
(v6.2#6252)
