hbase-issues mailing list archives

From "Gary Helmling (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-7460) Cleanup client connection layers
Date Fri, 18 Jan 2013 20:14:12 GMT

     [ https://issues.apache.org/jira/browse/HBASE-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Helmling updated HBASE-7460:
---------------------------------

    Attachment: HBASE-7460_2.patch
    
> Cleanup client connection layers
> --------------------------------
>
>                 Key: HBASE-7460
>                 URL: https://issues.apache.org/jira/browse/HBASE-7460
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, IPC/RPC
>            Reporter: Gary Helmling
>         Attachments: HBASE-7460_2.patch
>
>
> This issue originated from a discussion over in HBASE-7442.  We currently have a broken
abstraction with {{HBaseClient}}, where it is bound to a single {{Configuration}} instance
at time of construction, but then reused for all connections to all clusters.  This is combined
with multiple, overlapping layers of connection caching.
> Going through this code, it seems like we have a lot of mismatch between the higher layers
and the lower layers, with too much abstraction in between. At the lower layers, most of the
{{ClientCache}} stuff seems completely unused. We currently effectively have an {{HBaseClient}}
singleton (for {{SecureClient}} as well in 0.92/0.94) in the client code, as I don't see anything
that calls the constructor or {{RpcEngine.getProxy()}} versions with a non-default socket
factory. So a lot of the code around this seems like built-up waste.
> The fact that a single Configuration is fixed in the {{HBaseClient}} seems like a broken
abstraction as it currently stands. In addition to cluster ID, other configuration parameters
(max retries, retry sleep) are fixed at time of construction. The more I look at the code,
the more it looks like the {{ClientCache}} and sharing the {{HBaseClient}} instance is an
unnecessary complication. Why cache the {{HBaseClient}} instances at all? In {{HConnectionManager}},
we already have a mapping from {{Configuration}} to {{HConnection}}. It seems to me like each
{{HConnection(Implementation)}} instance should have its own {{HBaseClient}} instance, doing
away with the {{ClientCache}} mapping. This would keep each {{HBaseClient}} associated with
a single cluster/configuration and fix the current breakage from reusing the same {{HBaseClient}}
against different clusters.
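
As a rough illustration of the point above, here is a minimal sketch contrasting a shared, cached client with one client per connection. It is not HBase code: every class name below ({{SketchConfig}}, {{SketchClient}}, {{SharedClientCache}}, {{SketchConnection}}, {{SketchConnectionManager}}) is a hypothetical stand-in, and the shared cache is modelled as a plain singleton on the assumption, taken from the description, that the real cache is effectively one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Configuration: just a cluster ID and one retry setting.
final class SketchConfig {
    final String clusterId;
    final int maxRetries;

    SketchConfig(String clusterId, int maxRetries) {
        this.clusterId = clusterId;
        this.maxRetries = maxRetries;
    }
}

// Hypothetical stand-in for the RPC client: its settings are fixed at construction.
final class SketchClient {
    final String clusterId;
    final int maxRetries;

    SketchClient(SketchConfig conf) {
        this.clusterId = conf.clusterId;
        this.maxRetries = conf.maxRetries;
    }
}

// Shared-cache style (assumed singleton): the first configuration seen wins for everyone.
final class SharedClientCache {
    private static SketchClient cached;

    static synchronized SketchClient get(SketchConfig conf) {
        if (cached == null) {
            cached = new SketchClient(conf);
        }
        return cached; // conf is ignored on every later call
    }
}

// Proposed style: each connection constructs and owns exactly one client.
final class SketchConnection {
    final SketchClient client;

    SketchConnection(SketchConfig conf) {
        this.client = new SketchClient(conf);
    }
}

// Stand-in for the existing configuration-to-connection mapping (keyed by cluster ID here).
final class SketchConnectionManager {
    private static final Map<String, SketchConnection> CONNECTIONS =
            new HashMap<String, SketchConnection>();

    static synchronized SketchConnection getConnection(SketchConfig conf) {
        SketchConnection conn = CONNECTIONS.get(conf.clusterId);
        if (conn == null) {
            conn = new SketchConnection(conf);
            CONNECTIONS.put(conf.clusterId, conn);
        }
        return conn;
    }
}

final class SharedVsOwnedDemo {
    public static void main(String[] args) {
        SketchConfig a = new SketchConfig("cluster-a", 3);
        SketchConfig b = new SketchConfig("cluster-b", 10);

        SharedClientCache.get(a);
        SketchClient shared = SharedClientCache.get(b);
        // prints "shared client for b: cluster=cluster-a retries=3"
        System.out.println("shared client for b: cluster=" + shared.clusterId
                + " retries=" + shared.maxRetries);

        SketchClient owned = SketchConnectionManager.getConnection(b).client;
        // prints "own client for b: cluster=cluster-b retries=10"
        System.out.println("own client for b: cluster=" + owned.clusterId
                + " retries=" + owned.maxRetries);
    }
}
```

In the sketch, the second caller of the shared cache silently gets a client carrying the first caller's cluster ID and retry setting, while the per-connection variant needs no client cache at all, because the connection mapping already scopes each client to one configuration.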
> We need a refactoring of some of the interactions of {{HConnection(Implementation)}},
{{HBaseRPC/RpcEngine}}, and {{HBaseClient}}. Offhand, we might want to expose a separate
{{RpcEngine.getClient()}} method that returns a new {{RpcClient}} interface (implemented by
{{HBaseClient}}) and move the {{RpcEngine.getProxy()}}/{{stopProxy()}} implementations into
the client, so that all proxy invocations can go through the same client without requiring the
static client cache. I haven't fully thought this through, so I could be missing other important
aspects. But that approach at least seems like a step in the right direction for fixing the
client abstractions.
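
One possible shape for the refactoring described above is sketched below. This is an illustration under stated assumptions, not the attached patch: the interfaces {{SketchRpcEngine}} and {{SketchRpcClient}}, the {{EchoProtocol}} example protocol, and the loopback implementations are all hypothetical, and the method signatures are guesses at what a {{getClient()}}-style API could look like. The only real library API used is {{java.lang.reflect.Proxy}}.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;

// Hypothetical client interface: proxies are created and stopped through the client itself.
interface SketchRpcClient {
    <T> T getProxy(Class<T> protocol, String address);

    void stop();
}

// Hypothetical engine interface: it hands out a client instead of individual proxies.
interface SketchRpcEngine {
    SketchRpcClient getClient(String clusterId);
}

// Example protocol, used only to exercise the sketch.
interface EchoProtocol {
    String echo(String msg);
}

// Loopback client: instead of sending bytes, it returns a string describing the call.
final class LoopbackRpcClient implements SketchRpcClient {
    private final String clusterId;
    private volatile boolean stopped;

    LoopbackRpcClient(String clusterId) {
        this.clusterId = clusterId;
    }

    @Override
    public <T> T getProxy(Class<T> protocol, final String address) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (stopped) {
                throw new IllegalStateException("client for " + clusterId + " is stopped");
            }
            // A real client would serialize the call and send it over a connection here.
            return clusterId + "@" + address + ":" + method.getName()
                    + "(" + Arrays.toString(args) + ")";
        };
        Object instance = Proxy.newProxyInstance(
                protocol.getClassLoader(), new Class<?>[] {protocol}, handler);
        return protocol.cast(instance);
    }

    @Override
    public void stop() {
        stopped = true; // invalidates every proxy created by this client
    }
}

// Engine that creates a fresh client per call; no static client cache is involved.
final class LoopbackRpcEngine implements SketchRpcEngine {
    @Override
    public SketchRpcClient getClient(String clusterId) {
        return new LoopbackRpcClient(clusterId);
    }
}

final class RpcClientDemo {
    public static void main(String[] args) {
        SketchRpcEngine engine = new LoopbackRpcEngine();
        SketchRpcClient client = engine.getClient("cluster-a");
        EchoProtocol proxy = client.getProxy(EchoProtocol.class, "rs1");
        // prints "cluster-a@rs1:echo([hi])"
        System.out.println(proxy.echo("hi"));
        client.stop();
    }
}
```

In this arrangement a connection would hold the client returned by the engine, every proxy it hands out shares that client's state, and stopping the client affects only that connection's proxies.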

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
