hbase-issues mailing list archives

From "Bryan Keller (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3792) TableInputFormat leaks ZK connections
Date Fri, 03 Feb 2012 02:51:53 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199488#comment-13199488 ]

Bryan Keller commented on HBASE-3792:

The latest Cloudera code introduced the new reference-counting connection management. There
appears to be a reference-count leak in the HTable constructor, so connection leaks show up
again and my patch doesn't fix them. As a hack for now, I force the connection to close by
using reflection to set the reference counter to 1 and then calling close() on the connection.
I do this after calling table.close() in TableInputFormat, TableRecordReader, and TableOutputFormat.
I think I should log another bug, as the leak is not in the MapReduce classes.
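
The workaround described above can be sketched roughly as follows. This is only an
illustration under stated assumptions: CountedConnection and its refCount field are
hypothetical stand-ins, not the real HBase connection class, whose private field name is
not given in this message and would have to be looked up for the version in use.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a reference-counted connection; not an HBase class.
class CountedConnection {
    private int refCount;    // number of outstanding users of the connection
    private boolean closed;  // true once the underlying resource is released

    void acquire() { refCount++; }

    // Releases one reference; the resource is freed only when the count hits zero.
    void close() {
        if (refCount > 0) {
            refCount--;
        }
        if (refCount == 0) {
            closed = true;
        }
    }

    boolean isClosed() { return closed; }
}

class ForceClose {
    // Sets the private counter to 1 via reflection so that the next close()
    // drops it to zero and actually releases the connection.
    static void forceClose(CountedConnection conn) {
        try {
            Field counter = CountedConnection.class.getDeclaredField("refCount");
            counter.setAccessible(true);
            counter.setInt(conn, 1);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not reset reference count", e);
        }
        conn.close();
    }
}
```

Reflection of this kind depends on a private field, so it can break on any upgrade; it is
a stopgap until the counter leak itself is fixed.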

> TableInputFormat leaks ZK connections
> -------------------------------------
>                 Key: HBASE-3792
>                 URL: https://issues.apache.org/jira/browse/HBASE-3792
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.1
>         Environment: Java 1.6.0_24, Mac OS X 10.6.7
>            Reporter: Bryan Keller
>         Attachments: patch0.90.4, tableinput.patch
> The TableInputFormat creates an HTable using a new Configuration object, and it never
> cleans it up. When running a Mapper, the TableInputFormat is instantiated and the ZK connection
> is created. While this connection is not explicitly cleaned up, the Mapper process eventually
> exits and thus the connection is closed. Ideally the TableRecordReader would close the connection
> in its close() method rather than relying on the process to die for connection cleanup. This
> is fairly easy to implement by overriding TableRecordReader, and also overriding TableInputFormat
> to specify the new record reader.
> The leak occurs when the JobClient is initializing and needs to retrieve the splits.
> To get the splits, it instantiates a TableInputFormat. Doing so creates a ZK connection that
> is never cleaned up. Unlike the mapper, however, my job client process does not die. Thus
> the ZK connections accumulate.
> I was able to fix the problem by writing my own TableInputFormat that does not initialize
> the HTable in the setConf() method and does not have an HTable member variable. Rather, it
> has a variable for the table name. The HTable is instantiated where needed and then cleaned
> up. For example, in the getSplits() method, I create the HTable, then close the connection
> once the splits are retrieved. I also create the HTable when creating the record reader, and
> I have a record reader that closes the connection when done.
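
The approach in the description (keep only the table name, open the table where it is
needed, and close it afterwards) might look something like the sketch below. All class
names here are hypothetical stand-ins so the example runs without HBase: TableHandle plays
the role of HTable, and its OPEN counter exists only to make leaks visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for HTable; counts open handles so leaks can be observed.
class TableHandle implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();
    private final String tableName;
    private boolean closed;

    TableHandle(String tableName) {
        this.tableName = tableName;
        OPEN.incrementAndGet();
    }

    // Made-up region start keys; a real table would ask the cluster.
    List<String> startKeys() {
        List<String> keys = new ArrayList<>();
        keys.add(tableName + ":a");
        keys.add(tableName + ":m");
        return keys;
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

// Record reader that owns its table handle and releases it in close().
class ClosingRecordReader implements AutoCloseable {
    private final TableHandle table;

    ClosingRecordReader(TableHandle table) { this.table = table; }

    @Override
    public void close() { table.close(); }
}

// Input format that stores only the table name, never a long-lived handle.
class NameOnlyInputFormat {
    private final String tableName;

    NameOnlyInputFormat(String tableName) { this.tableName = tableName; }

    // Opens the table just long enough to compute the splits.
    List<String> getSplits() {
        try (TableHandle table = new TableHandle(tableName)) {
            return new ArrayList<>(table.startKeys());
        }
    }

    // Hands a fresh handle to the reader, which closes it when done.
    ClosingRecordReader createRecordReader() {
        return new ClosingRecordReader(new TableHandle(tableName));
    }
}
```

With this shape, a long-lived job client that only calls getSplits() holds no open handle
once the call returns, which is the case where the leak was observed.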

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

