hadoop-common-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1528) HClient for multiple tables
Date Wed, 27 Jun 2007 19:02:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508622 ]

Jim Kellerman commented on HADOOP-1528:

See Michael's and my comments on HADOOP-1531.

Why not have one HConnection object for each HBase instance?

Since the HConnection object is managing region-to-server mappings, I guess it makes sense
to cache all the server information in the connection object rather than just the root/meta
information, as I suggested previously.

My original thinking was that since what you call HTable is associated with a single table,
it made sense to cache the information for that table there instead of in the connection
object. This way, when you are done with that table, its cache will go away when the HTable
object goes away.

If you maintain the region-to-server cache for all the open tables for an HBase instance in
the HConnection, then there should probably be a close method on HTable so it can tell the connection
to drop the information for that table.
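
To make the second option concrete, here is a rough sketch of a connection-level
cache with a close method on the table object. All class and method names here
(SketchConnection, SketchTable, cacheRegion, dropTable, and so on) are made up
for illustration and are not the actual HBase classes; the region start key and
server address are plain strings only to keep the example short.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; these are not the real HConnection/HTable classes.
class SketchConnection {
    // table name -> (region start key -> server address)
    private final Map<String, Map<String, String>> regionCache = new ConcurrentHashMap<>();

    // Remember which server hosts the region that starts at startKey.
    void cacheRegion(String table, String startKey, String server) {
        regionCache.computeIfAbsent(table, t -> new ConcurrentHashMap<>()).put(startKey, server);
    }

    // Returns the cached server for the region, or null if nothing is cached.
    String lookup(String table, String startKey) {
        Map<String, String> regions = regionCache.get(table);
        return regions == null ? null : regions.get(startKey);
    }

    // Drop everything cached for one table.
    void dropTable(String table) {
        regionCache.remove(table);
    }

    int cachedTableCount() {
        return regionCache.size();
    }
}

// Hypothetical per-table handle that relies on the shared connection cache.
class SketchTable {
    private final SketchConnection connection;
    private final String name;

    SketchTable(SketchConnection connection, String name) {
        this.connection = connection;
        this.name = name;
    }

    // Tell the connection this table is no longer in use so its cache can go away.
    void close() {
        connection.dropTable(name);
    }
}
```

One caveat with this sketch: if two SketchTable objects are open on the same
table, closing either one drops the cache for both, so a real version would
presumably need some form of reference counting per table.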

> HClient for multiple tables
> ---------------------------
>                 Key: HADOOP-1528
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1528
>             Project: Hadoop
>          Issue Type: Task
>          Components: contrib/hbase
>    Affects Versions: 0.14.0
>            Reporter: James Kennedy
> I have an app that needs to access multiple HBase tables concurrently.  The current HClient
> can only have one table open at a time even though it caches region servers of multiple tables
> as they are looked up.
> This means that my application layer must open multiple HClients, one per table, perhaps
> caching those HClients in a pool to reuse them (and their cached table data) as appropriate.
> or
> Shall I write an HClient patch that makes the HClient multi-table thread-safe?
> Jim's suggestion is to implement an HClient singleton (call it HClientManager?) that
> does the actual caching/resync of root/meta regions.  Individual HClients will still be one
> table, one update row at a time but will rely on the singleton for the cached table info.
> We want HClients to be created and disposed as fast as possible with a minimum of meta lookups.
> Jim, what about non-root/meta regions, shouldn't they be cached and refreshed via the
> singleton also?  It may still be possible that a region split/resync will occur during an
> HClient session, so does the HClientManager need to be able to notify the corresponding HClients
> in that event?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
