hadoop-common-dev mailing list archives

From "James Kennedy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1528) HClient for multiple tables
Date Wed, 27 Jun 2007 15:07:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508568 ]

James Kennedy commented on HADOOP-1528:

I decided to write a new equivalent of HClient using two classes: HConnection and HTable.
These represent a splitting of the original HClient. HConnection takes care of administrative
functions like create/delete/disable/enableTable(), etc., caches all table info and region
connections, and serves out HTables via openTable() or createTable().

HTable is a lighter-weight client that allows scan/update of a single HBase table. It uses
its parent HConnection to initialize any region server proxies, etc.

The HConnection is NOT a singleton. I figured that within a single app, a user may need to access
multiple HBase clusters. So instead I made HConnection maintain a static <Configuration,
HConnection> map with static getters that return the HConnection for a given Configuration. The
default configuration is the one on the classpath, but it is possible to get HConnections based
on any other Configuration. HConnection statically preserves those connections for the lifetime
of the application, and since its constructor is private, it is not possible to instantiate an
unregistered HConnection.
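
To make the registry idea concrete, here is a minimal sketch of a per-configuration connection map with a private constructor. It is not the patch code: SketchConfiguration and SketchConnection are hypothetical stand-ins, the "default-cluster" address is a placeholder, and the sketch assumes that two configurations naming the same cluster should compare equal. It also uses Java 8's computeIfAbsent for brevity.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a Hadoop Configuration: it only carries a cluster address.
final class SketchConfiguration {
    private final String masterAddress;

    SketchConfiguration(String masterAddress) {
        this.masterAddress = Objects.requireNonNull(masterAddress);
    }

    String masterAddress() {
        return masterAddress;
    }

    // Assumption: configurations are compared by value, so equal addresses share a connection.
    @Override
    public boolean equals(Object other) {
        return other instanceof SketchConfiguration
                && ((SketchConfiguration) other).masterAddress.equals(masterAddress);
    }

    @Override
    public int hashCode() {
        return masterAddress.hashCode();
    }
}

// Hypothetical stand-in for HConnection: one instance per configuration, never a global singleton.
final class SketchConnection {
    // Placeholder for "the configuration found on the classpath".
    private static final SketchConfiguration DEFAULT_CONF =
            new SketchConfiguration("default-cluster");

    // Static registry that keeps every connection alive for the application lifetime.
    private static final Map<SketchConfiguration, SketchConnection> CONNECTIONS =
            new ConcurrentHashMap<>();

    private final SketchConfiguration conf;

    // Private constructor: callers cannot create a connection that bypasses the registry.
    private SketchConnection(SketchConfiguration conf) {
        this.conf = conf;
    }

    // Getter for the default configuration.
    static SketchConnection getConnection() {
        return getConnection(DEFAULT_CONF);
    }

    // Getter for any other configuration; creates and registers the connection on first use.
    static SketchConnection getConnection(SketchConfiguration conf) {
        return CONNECTIONS.computeIfAbsent(conf, SketchConnection::new);
    }

    SketchConfiguration configuration() {
        return conf;
    }
}
```

Keying the map by configuration means an application talking to two clusters gets two connections, while repeated lookups for the same cluster return the same cached instance.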

The chief advantage of this HConnection-HTable pattern is that one can have multiple concurrent
transactions on multiple tables that share a single HBase "connection". 
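
The sharing described above can be illustrated with a small sketch in which several lightweight table handles delegate region lookups to one connection-level cache. All names here (SharedConnection, TableHandle, locateRegion, the "regionserver-for-" strings) are hypothetical and exist only for illustration; the real classes talk to region servers rather than build strings.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for HConnection: owns the cached table/region info.
final class SharedConnection {
    // Table name -> cached region location (a plain string in this sketch).
    private final Map<String, String> regionCache = new ConcurrentHashMap<>();
    // Counts simulated meta lookups so the effect of caching is observable.
    private final AtomicInteger metaLookups = new AtomicInteger();

    // Serves out a lightweight handle bound to this connection.
    TableHandle openTable(String tableName) {
        return new TableHandle(this, tableName);
    }

    // Looks up a table's region once, then answers from the cache.
    String locateRegion(String tableName) {
        return regionCache.computeIfAbsent(tableName, name -> {
            metaLookups.incrementAndGet();
            return "regionserver-for-" + name;
        });
    }

    int metaLookupCount() {
        return metaLookups.get();
    }
}

// Hypothetical stand-in for HTable: cheap to create, holds no cache of its own.
final class TableHandle {
    private final SharedConnection connection;
    private final String tableName;

    TableHandle(SharedConnection connection, String tableName) {
        this.connection = connection;
        this.tableName = tableName;
    }

    // Simulates an update by describing where the write would be routed.
    String put(String row, String value) {
        return connection.locateRegion(tableName) + " <- " + tableName + "/" + row + "=" + value;
    }
}
```

In this sketch, opening a second handle on an already-seen table costs no extra lookup, which is the property that makes the handles cheap to create and dispose.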

I'll post a patch when I've tested some more. Right now this code presumes the Hadoop-1531 patch
is applied and I'm trying to avoid code tangle... it would be great if Hadoop-1521 got applied
soon unless you guys reject it.

> HClient for multiple tables
> ---------------------------
>                 Key: HADOOP-1528
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1528
>             Project: Hadoop
>          Issue Type: Task
>          Components: contrib/hbase
>    Affects Versions: 0.14.0
>            Reporter: James Kennedy
> I have an app that needs to access multiple HBase tables concurrently.  The current HClient
> can only have one table open at a time even though it caches region servers of multiple tables
> as they are looked up.
> This means that my application layer must open multiple HClients, one per table, perhaps
> caching those HClients in a pool to reuse them (and their cached table data) as appropriate.
> or
> Shall I write an HClient patch that makes the HClient multi-table thread-safe?
> Jim's suggestion is to implement an HClient singleton (call it HClientManager?) that
> does the actual caching/resync of root/meta regions.  Individual HClients will still be one
> table, one update row at a time but will rely on the singleton for the cached table info.
> We want HClients to be created and disposed as fast as possible with a minimum of meta lookups.
> Jim, what about non-root/meta regions, shouldn't they be cached and refreshed via the
> singleton also?  It may still be possible that a region split/resync will occur during an
> HClient session, so does the HClientManager need to be able to notify the corresponding HClients
> in that event?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
