hbase-issues mailing list archives

From "nkeywal (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
Date Tue, 28 Feb 2012 15:13:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218257#comment-13218257 ]

nkeywal commented on HBASE-5399:
--------------------------------

If we really want it, I found three options and tried two of them.

1) Adding 'close' to the HMasterInterface
After looking at it, I don't think it's a good option: HMasterInterface is an interface shared
between the client and the server, so adding a close function to it would mean the server
must implement it, even though it's a client-side function only. I believe that's why
there is already a 'stopProxy' function in RpcEngine instead of a close function.

2) Adding the possibility of a delayed close in RpcEngine
Instead of doing it for HMasterInterface in Connection only, we could do it for all proxies and
code this in RpcEngine.
There is already reference counting in the HBase RpcEngine, so we could add some code there
to allow a delayed close. I don't see why it would not be possible: all the code seems to
be in the hbase package (and not hadoop). This would require a smart convention to make it
configurable on a per-proxy basis, but it should work.
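
To make the idea of option 2 concrete, here is a minimal sketch of reference counting combined with a delayed close. It is not HBase code: `DelayedCloseHolder`, `keepAliveMillis` and `closeIfIdle` are hypothetical names, and the sketch assumes a periodic task calls `closeIfIdle` with the current time. Releasing the last reference only records when the resource became idle; the actual close happens later, and only if nobody re-acquired the resource in the meantime.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper (not part of HBase): reference counting with a delayed close. */
class DelayedCloseHolder<T extends Closeable> {
  private final T resource;
  private final long keepAliveMillis;
  private int refCount = 0;
  private long idleSinceMillis = -1; // -1 while in use or never released
  private boolean closed = false;

  DelayedCloseHolder(T resource, long keepAliveMillis) {
    this.resource = resource;
    this.keepAliveMillis = keepAliveMillis;
  }

  /** Takes a reference and cancels any pending delayed close. */
  synchronized T acquire() {
    if (closed) {
      throw new IllegalStateException("resource already closed");
    }
    refCount++;
    idleSinceMillis = -1;
    return resource;
  }

  /** Drops a reference; the last release only starts the keep-alive window. */
  synchronized void release(long nowMillis) {
    if (refCount == 0) {
      throw new IllegalStateException("release without acquire");
    }
    refCount--;
    if (refCount == 0) {
      idleSinceMillis = nowMillis;
    }
  }

  /** Meant to be called periodically; returns true if this call closed the resource. */
  synchronized boolean closeIfIdle(long nowMillis) {
    if (closed || refCount > 0 || idleSinceMillis < 0) {
      return false;
    }
    if (nowMillis - idleSinceMillis < keepAliveMillis) {
      return false; // still inside the keep-alive window
    }
    try {
      resource.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    closed = true;
    return true;
  }
}
```

Passing the time in as a parameter keeps the sketch deterministic; a real version would more likely read the clock itself and would need the per-proxy configuration mentioned above.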

3) Adding a class with delegation
So I've got this:
{noformat}
public interface SharedMaster extends HMasterInterface, Closeable {}
{noformat}

With this in HConnection

{noformat}
public interface HConnection extends Abortable, Closeable {
  public SharedMaster getSharedMaster();
}
{noformat}

Then the client writes
{noformat}
SharedMaster master = connection.getSharedMaster();
try {
  master.move(encodedRegionName, destServerName);
} finally {
  master.close();
}
{noformat}

With a Java dynamic proxy to manage the delegation for us:
{noformat}
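    // Needs java.lang.reflect.InvocationHandler, InvocationTargetException and Method.
    private static class SharedMasterHandler implements InvocationHandler {
      private final HConnectionImplementation connection;
      private final HMasterInterface master;

      SharedMasterHandler(HConnectionImplementation connection, HMasterInterface master) {
        this.connection = connection;
        this.master = master;
      }

      @Override
      public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // close() comes from Closeable, not from the master: release the shared instance.
        if (method.getName().equals("close") && method.getParameterTypes().length == 0) {
          connection.releaseSharedMaster(master);
          return null;
        }
        try {
          // Everything else is forwarded to the real master.
          return method.invoke(master, args);
        } catch (InvocationTargetException e) {
          // Rethrow the master's own exception instead of the reflection wrapper.
          throw e.getCause();
        }
      }
    }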
{noformat}
releaseSharedMaster is private in this solution.
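
For readers who want to see how such a handler gets wired up, here is a self-contained sketch using `java.lang.reflect.Proxy`. It is only an illustration under assumptions: `MasterService`, `SharedMasterService` and `SharedMasterDemo.Connection` are made-up stand-ins for HMasterInterface, SharedMaster and the connection implementation, and the `users` counter replaces whatever bookkeeping releaseSharedMaster really does.

```java
import java.io.Closeable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Stand-in for HMasterInterface (hypothetical, so the sketch compiles on its own).
interface MasterService {
  String move(String region, String destServer);
}

// Stand-in for SharedMaster: the master methods plus a client-side close().
interface SharedMasterService extends MasterService, Closeable {
  @Override
  void close(); // narrowed: no checked exception in this sketch
}

class SharedMasterDemo {
  /** Stand-in for the connection; 'users' counts outstanding shared masters. */
  static class Connection {
    int users = 0;
    private final MasterService master;

    Connection(MasterService master) {
      this.master = master;
    }

    SharedMasterService getSharedMaster() {
      users++;
      InvocationHandler handler = new InvocationHandler() {
        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
          if (method.getName().equals("close") && method.getParameterTypes().length == 0) {
            users--; // release the shared master instead of forwarding the call
            return null;
          }
          try {
            return method.invoke(master, args);
          } catch (InvocationTargetException e) {
            throw e.getCause(); // surface the master's own exception
          }
        }
      };
      // The proxy implements SharedMasterService and routes every call to the handler.
      return (SharedMasterService) Proxy.newProxyInstance(
          SharedMasterService.class.getClassLoader(),
          new Class<?>[] { SharedMasterService.class },
          handler);
    }
  }
}
```

The caller only ever sees the proxy, so the try/finally pattern shown earlier works unchanged, and the real master never has to know about close().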

It was not really my first idea, but it's a reasonable way to get to the objective. The reflective
delegation is not fast, obviously, but it doesn't matter here, as a much more expensive
remote call follows right after.


I'm currently testing it; it seems to work.
                
> Cut the link between the client and the zookeeper ensemble
> ----------------------------------------------------------
>
>                 Key: HBASE-5399
>                 URL: https://issues.apache.org/jira/browse/HBASE-5399
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 0.94.0
>         Environment: all
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>         Attachments: 5399_inprogress.patch, 5399_inprogress.v3.patch, 5399_inprogress.v9.patch
>
>
> The link is often considered an issue, for various reasons, one of them being that
there is a limit on the number of connections that ZK can manage. Stack also suggested
removing the link to the master from HConnection.
> There are choices to be made considering the existing API (that we don't want to break).
> The first patches I will submit on hadoop-qa should not be committed: they are here to
show the progress on the direction taken.
> ZooKeeper is used for:
> - public getter, to let the client do whatever it wants, and close ZooKeeper when closing
the connection => we have to deprecate this but keep it.
> - read the master address to create a master => now done with a temporary zookeeper
connection
> - read root location => now done with a temporary zookeeper connection, but questionable.
Used in public function "locateRegion". To be reworked.
> - read cluster id => now done once with a temporary zookeeper connection.
> - check if base node is available => now done once with a zookeeper connection given
as a parameter
> - isTableDisabled/isTableAvailable => public functions, now done with a temporary
zookeeper connection.
>      - Called internally from HBaseAdmin and HTable
> - getCurrentNrHRS(): public function to get the number of region servers and create a
pool of threads => now done with a temporary zookeeper connection
> Master is used for:
> - getMaster public getter, as for ZooKeeper => we have to deprecate this but keep
it.
> - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
> - getHTableDescriptor*: public functions offering access to the master.  => we could
make them use a temporary master connection as well.
> Main points are:
> - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled
architecture ;-). This can be changed, but requires a lot of modifications in these classes
(likely adding a class in the middle of the hierarchy, something like that). Anyway, a
non-connected client will always be much slower, because it needs a TCP connection, and
establishing a TCP connection is slow.
> - having a link between ZK and all the clients seems to make sense for some use cases.
However, it won't scale if a TCP connection is required for every client.
> - if we move the table descriptor part away from the client, we need to find a new place
for it.
> - we will have the same issue in HBaseAdmin (for both ZK & Master); maybe we can
put a timeout on the connection. That would make the whole system less deterministic, however.


        
