hbase-issues mailing list archives

From "Mikhail Antonov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4495) CatalogTracker has an identity crisis; needs to be cut-back in scope
Date Mon, 02 Jun 2014 19:37:03 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015783#comment-14015783

Mikhail Antonov commented on HBASE-4495:

bq. I thought the client opened connection to zk to find meta and then closed its zk connection?
MetaRegionTracker subclasses ZooKeeperNodeTracker, so it does track the znode through a
watcher, and this class is used within ConnectionManager. Unless I'm missing something, that
means the client<-->zk connection stays open? I would also think that having clients subscribed
to ZK watchers isn't good.

bq. This seems a little dangerous? If Masters in cluster are changed, all clients must be
updated rather than just zk? Or are you saying that just as client has address of the zk ensemble,
we'd have the equivalent for the cluster of hbase masters? An address for the hbase masters

Yes, that's what I mean. My thinking is that we will have a quorum of multiple active masters,
each one hosting a replica of meta (how to make meta splittable across machines is an important
consideration, but a bit off this topic), and in this picture the client shouldn't need to know
the ZK ensemble.

 - each client knows about the quorum of active masters (e.g. an ip:port;ip:port list)
 - on initial connection the client chooses which master to connect to, and then sticks to it
as long as that master is alive. If this master fails, the client fails over to the next one in the list
 - the client doesn't need the meta location, since meta is collocated with the master. Strictly
speaking, even if meta is split apart for scalability, the client can find the location of the
required meta region by sending an RPC to its master.
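
To make the first two points concrete, here is a rough sketch of the client-side selection logic. Everything in it is hypothetical: StickyMasterSelector, the "ip:port;ip:port" quorum string format and the isAlive check are illustration only, not existing HBase classes or configuration. The meta lookup RPC from the third point is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/**
 * Hypothetical sketch, not an HBase API: the client sticks to one master
 * from a configured list and fails over to the next one only when the
 * current master is reported dead.
 */
final class StickyMasterSelector {
    private final List<String> masters = new ArrayList<>(); // "ip:port" entries, in configured order
    private int current = 0;                                // index of the master we stick to

    /** Parses an assumed "ip:port;ip:port" style quorum string. */
    StickyMasterSelector(String quorum) {
        for (String entry : quorum.split(";")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                masters.add(trimmed);
            }
        }
        if (masters.isEmpty()) {
            throw new IllegalArgumentException("empty master quorum");
        }
    }

    /**
     * Returns the current master if the caller-supplied liveness check
     * passes; otherwise walks the list once (wrapping around) and sticks
     * to the first live master found.
     */
    String select(Predicate<String> isAlive) {
        for (int tried = 0; tried < masters.size(); tried++) {
            int idx = (current + tried) % masters.size();
            if (isAlive.test(masters.get(idx))) {
                current = idx;
                return masters.get(idx);
            }
        }
        throw new IllegalStateException("no live master in quorum");
    }
}
```

With something like this the client never talks to ZK; the only address list it is configured with is the master list itself. How liveness is detected (RPC timeout, heartbeat, etc.) is deliberately left to the caller here.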

What do you think? I can pick up this one. Aligns with my work anyway and seems worth doing.

> CatalogTracker has an identity crisis; needs to be cut-back in scope
> --------------------------------------------------------------------
>                 Key: HBASE-4495
>                 URL: https://issues.apache.org/jira/browse/HBASE-4495
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.94.0
>            Reporter: stack
> CT needs a good reworking.  I'd suggest its scope be cut way down to only deal in zk
transactions rather than zk and reading meta location in hbase (over an HConnection) and being
a purveyor of HRegionInterfaces on meta and root servers and being an Abortable and a verifier
of catalog locations.  Once this is done, I would suggest it then better belongs over under
the zk package and that the Meta* classes then move to client package.
> Here's some messy notes I added to head of CT class in hbase-3446 where I spent some
time trying to make out what it was CT did.
> {code}
>   // TODO: This class needs a rethink.  The original intent was that it would be
>   // the one-stop-shop for root and meta locations and that it would get this
>   // info from reading and watching zk state.  The class was to be used by
>   // servers when they needed to know of root and meta movement but also by
>   // client-side (inside in HTable) so rather than figure root and meta
>   // locations on fault, the client would instead get notifications out of zk.
>   // 
>   // But this original intent is frustrated by the fact that this class has to
>   // read an hbase table, the -ROOT- table, to figure out the .META. region
>   // location which means we depend on an HConnection.  HConnection will do
>   // retrying but also, it has its own mechanism for finding root and meta
>   // locations (and for 'verifying'; it tries the location and if it fails, does
>   // new lookup, etc.).  So, at least for now, HConnection (or HTable) can't
>   // have a CT since CT needs a HConnection (Even then, do want HT to have a CT?
>   // For HT keep up a session with ZK?  Rather, shouldn't we do like asynchbase
>   // where we'd open a connection to zk, read what we need then let the
>   // connection go?).  The 'fix' is to make it so both root and meta addresses
>   // are wholly up in zk -- not in zk (root) -- and in an hbase table (meta).
>   //
>   // But even then, this class does 'verification' of the location and it does
>   // this by making a call over an HConnection (which will do its own root
>   // and meta lookups).  Isn't this verification 'useless' since when we
>   // return, whatever is dependent on the result of this call then needs to
>   // use HConnection; what we have verified may change in meantime (HConnection
>   // uses the CT primitives, the root and meta trackers finding root locations).
>   //
>   // When meta is moved to zk, this class may make more sense.  In the
>   // meantime, it does not cohere.  It should just watch meta and root and
>   // NOT do verification -- let that be out in HConnection since it's going to
>   // be done there ultimately anyways.
>   //
>   // This class has spread throughout the codebase.  It needs to be reined in.
>   // This class should be used server-side only, even if we move meta location
>   // up into zk.  Currently it's used over in the client package. It's used in
>   // MetaReader and MetaEditor classes usually just to get the Configuration
>   // its using (It does this indirectly by asking its HConnection for its
>   // Configuration and even then this is just used to get an HConnection out on
>   // the other end). St.Ack 10/23/2011.
>   //
> {code}

