hbase-issues mailing list archives

From "Gary Helmling (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-7460) Cleanup client connection layers
Date Sat, 29 Dec 2012 17:38:12 GMT
Gary Helmling created HBASE-7460:
------------------------------------

             Summary: Cleanup client connection layers
                 Key: HBASE-7460
                 URL: https://issues.apache.org/jira/browse/HBASE-7460
             Project: HBase
          Issue Type: Improvement
          Components: Client, IPC/RPC
            Reporter: Gary Helmling


This issue originated from a discussion over in HBASE-7442.  We currently have a broken abstraction
with {{HBaseClient}}, where it is bound to a single {{Configuration}} instance at time of
construction, but then reused for all connections to all clusters.  This is combined with
multiple, overlapping layers of connection caching.
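
As a concrete illustration of that breakage, here is a minimal, hypothetical sketch (simplified names, not the actual {{HBaseClient}} code): the {{Configuration}} and the settings derived from it are baked in at construction, yet the same cached instance ends up serving calls against every cluster.

{code:java}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;

// Simplified sketch of the current breakage -- not the real HBaseClient.
// One Configuration (and the settings derived from it) is captured at
// construction time, but the instance is then reused for any cluster.
public class SketchHBaseClient {
  private final Configuration conf;  // fixed forever at construction
  private final int maxRetries;      // derived from that one conf
  private final long retrySleep;

  public SketchHBaseClient(Configuration conf) {
    this.conf = conf;
    this.maxRetries = conf.getInt("hbase.client.retries.number", 10);
    this.retrySleep = conf.getLong("hbase.client.pause", 1000);
  }

  // Every call reuses conf/maxRetries/retrySleep, even when the target
  // address belongs to a cluster that was configured quite differently.
  public Object call(Object param, InetSocketAddress address) {
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      // ... look up/create a cached connection to 'address' and send 'param',
      // sleeping retrySleep between attempts -- always with the captured conf ...
    }
    return null;
  }

  // Tears down whatever connections this client holds (placeholder).
  public void stop() {
  }
}
{code}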

Going through this code, it seems like we have a lot of mismatch between the higher layers
and the lower layers, with too much abstraction in between. At the lower layers, most of the
{{ClientCache}} machinery seems completely unused. We effectively have an {{HBaseClient}}
singleton (and likewise a {{SecureClient}} singleton in 0.92/0.94) in the client code, since I don't
see anything that calls the constructor or the {{RpcEngine.getProxy()}} variants with a non-default
socket factory. So a lot of the code around this looks like accumulated cruft.
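
To make the "effective singleton" point concrete, here is a condensed, hypothetical view of the caching pattern (simplified names, reusing the {{SketchHBaseClient}} from the sketch above, and assuming, per the observation about socket factories, that clients are cached per factory): because callers only ever supply the default factory, the map ends up holding a single client built from whichever {{Configuration}} arrived first.

{code:java}
import java.util.HashMap;
import java.util.Map;
import javax.net.SocketFactory;
import org.apache.hadoop.conf.Configuration;

// Condensed, hypothetical view of the ClientCache pattern. Every caller passes
// SocketFactory.getDefault(), so the map only ever holds one entry -- an
// effective singleton built from the first Configuration it happened to see.
class SketchClientCache {
  private final Map<SocketFactory, SketchHBaseClient> clients = new HashMap<>();

  synchronized SketchHBaseClient getClient(Configuration conf, SocketFactory factory) {
    SketchHBaseClient client = clients.get(factory);
    if (client == null) {
      client = new SketchHBaseClient(conf);  // the first caller's conf wins
      clients.put(factory, client);
    }
    return client;
  }
}
{code}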

The fact that a single {{Configuration}} is fixed in the {{HBaseClient}} seems like a broken abstraction
as it currently stands. In addition to the cluster ID, other configuration parameters (max retries,
retry sleep) are also fixed at time of construction. The more I look at the code, the more the
{{ClientCache}} and the shared {{HBaseClient}} instance look like an unnecessary complication.
Why cache the {{HBaseClient}} instances at all? In {{HConnectionManager}}, we already have
a mapping from {{Configuration}} to {{HConnection}}. It seems to me that each {{HConnection(Implementation)}}
instance should have its own {{HBaseClient}} instance, doing away with the {{ClientCache}}
mapping. This would keep each {{HBaseClient}} associated with a single cluster/configuration
and fix the current breakage from reusing the same {{HBaseClient}} against different clusters.
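
Sketched out, that ownership change might look roughly like this (hypothetical, simplified names again, reusing {{SketchHBaseClient}} from above): the connection builds its client from its own configuration and tears it down with the connection, so the cluster ID and retry settings always match the cluster that connection talks to.

{code:java}
import java.io.Closeable;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch of per-connection client ownership. Each connection
// builds and owns its client, so there is no shared ClientCache lookup and no
// cross-cluster reuse of a client built from some other connection's conf.
class SketchHConnectionImplementation implements Closeable {
  private final Configuration conf;
  private final SketchHBaseClient rpcClient;  // one client per connection/cluster

  SketchHConnectionImplementation(Configuration conf) {
    this.conf = conf;
    this.rpcClient = new SketchHBaseClient(conf);
  }

  SketchHBaseClient getRpcClient() {
    return rpcClient;
  }

  @Override
  public void close() throws IOException {
    rpcClient.stop();  // client lifecycle follows the connection's lifecycle
  }
}
{code}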

We need a refactoring of some of the interactions between {{HConnection(Implementation)}}, {{HBaseRPC/RpcEngine}},
and {{HBaseClient}}. Offhand, we might want to expose a separate {{RpcEngine.getClient()}}
method that returns a new {{RpcClient}} interface (implemented by {{HBaseClient}}) and move
the {{RpcEngine.getProxy()}}/{{stopProxy()}} implementations into the client, so that all proxy
invocations go through the same client without requiring the static client cache. I haven't
fully thought this through, so I could be missing other important aspects, but that approach
at least seems like a step in the right direction for fixing the client abstractions.
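
As a rough sketch of what that split might look like (interface and method names here are purely illustrative, not a committed design): the {{RpcEngine}} hands out a per-connection client, and proxy creation/teardown hangs off that client instead of a static cache.

{code:java}
import java.io.Closeable;
import java.io.IOException;
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;

// Illustrative sketch only: a per-connection RpcClient handed out by the
// RpcEngine, with proxy creation/teardown on the client rather than on a
// static client cache. HBaseClient would be one implementation of it.
interface SketchRpcClient extends Closeable {
  // Builds a proxy whose invocations all go through this client instance.
  <T> T getProxy(Class<T> protocol, InetSocketAddress address) throws IOException;

  void stopProxy(Object proxy);
}

interface SketchRpcEngine {
  // New factory method: one RpcClient per HConnection, constructed from that
  // connection's own Configuration.
  SketchRpcClient getClient(Configuration conf);
}
{code}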

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
