hbase-dev mailing list archives

From "Dave Latham (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-2027) HConnectionManager.HBASE_INSTANCES leaks TableServers
Date Thu, 03 Dec 2009 15:41:20 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Dave Latham updated HBASE-2027:

    Attachment: 2027-LRU.patch

I spent some more time thinking about this issue and reading through the background in
HBASE-1251, which set it up.

As it stands, the connection cache never shrinks, because each TableServers holds a strong
reference back to its HBaseConfiguration key in the HBASE_INSTANCES map.  With the first patch
I provided, however, if client code keeps a strong reference to an HBaseConfiguration but not
to the connection itself, the connection information may be freed even though the
HBaseConfiguration object is still around.  This is not desirable either.

Another possibility would be to make the reference from the TableServers to the
HBaseConfiguration a WeakReference, instead of making the HBASE_INSTANCES value a WeakReference
to the TableServers.  However, this also presents problems: if the client held a strong
reference to the HConnection but not to the HBaseConfiguration (less likely than the earlier
case, I believe), the configuration could be freed and methods that require it would fail.

I propose another, simpler approach: get rid of the WeakHashMap / WeakReferences entirely
and make the HBASE_INSTANCES map a simple LRU cache holding the 10 most recently used
HBaseConfiguration to HConnection mappings.  This bounds the amount of memory used by the
cache (better than the current implementation), at the risk that hbase clients which
repeatedly use more than 10 different hbase configurations will lose cached connections.
(I believe this to be very rare, but please enlighten me if I am mistaken.)
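
For readers unfamiliar with the idiom, a bounded LRU map of this kind can be sketched with
java.util.LinkedHashMap in access order.  This is only an illustration under assumptions, not
the contents of 2027-LRU.patch: the names LruCache and LruCacheDemo are hypothetical, and
plain strings stand in for the HBaseConfiguration keys and HConnection values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of a bounded LRU map; not the code in 2027-LRU.patch.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: iteration runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Consulted after each insertion; returning true evicts the eldest entry.
        return size() > maxEntries;
    }
}

class LruCacheDemo {
    public static void main(String[] args) {
        // Strings stand in for configuration keys and connection values (assumption).
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("confA", "connA");
        cache.put("confB", "connB");
        cache.get("confA");           // touch confA, so confB becomes the eldest
        cache.put("confC", "connC");  // exceeds the bound, so confB is evicted
        System.out.println(cache.keySet()); // prints [confA, confC]
    }
}
```

LinkedHashMap is not thread-safe, and in access order even get() modifies the map, so a cache
shared between threads would need external synchronization around every call.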

I'm attaching a new patch with this solution.  Do you think it would be better to make the
size of this cache a config option instead of hardcoding it?

> HConnectionManager.HBASE_INSTANCES leaks TableServers
> -----------------------------------------------------
>                 Key: HBASE-2027
>                 URL: https://issues.apache.org/jira/browse/HBASE-2027
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Dave Latham
>            Assignee: Dave Latham
>             Fix For: 0.20.3, 0.21.0
>         Attachments: 2027-LRU.patch, 2027.patch
> HConnectionManager.HBASE_INSTANCES is a WeakHashMap from HBaseConfiguration to TableServers.
> However, each TableServers has a strong reference back to the HBaseConfiguration key so they
> are never freed.  (See note at http://java.sun.com/javase/6/docs/api/java/util/WeakHashMap.html
> : "Implementation note: The value objects in a WeakHashMap are held by ordinary strong references.
> Thus care should be taken to ensure that value objects do not strongly refer to their own
> keys, either directly or indirectly, since that will prevent the keys from being discarded.")
> Moreover, HBaseConfiguration implements hashCode() but not equals() so identical HBaseConfiguration
> objects each get their own TableServers object.
> We had a long running HBase client process that was creating new HTable() objects, each
> creating a new HBaseConfiguration() and thus a new TableServers object.  It eventually went
> OOM, and gave a heap dump indicating 360 MB of data retained by HBASE_INSTANCES.
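
The hashCode()-without-equals() effect described in the issue can be reproduced in isolation.
The sketch below is an illustration under assumptions: ConfWithoutEquals is a hypothetical
stand-in for a configuration class, not HBaseConfiguration, and a plain HashMap is used so the
result does not depend on garbage collection (WeakHashMap also matches keys with equals()).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration class; not HBaseConfiguration itself.
class ConfWithoutEquals {
    final String quorum;

    ConfWithoutEquals(String quorum) {
        this.quorum = quorum;
    }

    @Override
    public int hashCode() {
        return quorum.hashCode();
    }
    // equals() is deliberately not overridden, so Object.equals (identity) applies.
}

class DuplicateKeyDemo {
    // Mimics a get-or-create cache lookup and reports how many entries were created.
    static int entriesFor(ConfWithoutEquals... confs) {
        Map<ConfWithoutEquals, Object> instances = new HashMap<>();
        for (ConfWithoutEquals conf : confs) {
            if (!instances.containsKey(conf)) {
                instances.put(conf, new Object()); // stands in for a new connection
            }
        }
        return instances.size();
    }

    public static void main(String[] args) {
        ConfWithoutEquals a = new ConfWithoutEquals("zk1");
        ConfWithoutEquals b = new ConfWithoutEquals("zk1");
        System.out.println(entriesFor(a, b)); // prints 2: same hash, but never equal
        System.out.println(entriesFor(a, a)); // prints 1: the same instance is found again
    }
}
```

Two objects with identical contents land in the same hash bucket but are never treated as the
same key, so each new configuration object creates a new cache entry.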
