Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id CD588996C for ; Tue, 6 Mar 2012 08:37:22 +0000 (UTC)
Received: (qmail 96487 invoked by uid 500); 6 Mar 2012 08:37:22 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 96438 invoked by uid 500); 6 Mar 2012 08:37:22 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 96410 invoked by uid 99); 6 Mar 2012 08:37:21 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 06 Mar 2012 08:37:21 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED,T_RP_MATCHES_RCVD
X-Spam-Check-By: apache.org
Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 06 Mar 2012 08:37:18 +0000
Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id C3830ADC8 for ; Tue, 6 Mar 2012 08:36:57 +0000 (UTC)
Date: Tue, 6 Mar 2012 08:36:57 +0000 (UTC)
From: "nkeywal (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1739246363.26481.1331023017802.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1646044855.37966.1329254159568.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

[
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223089#comment-13223089 ]

nkeywal commented on HBASE-5399:
--------------------------------

TestRegionRebalancing: seems to be a flaky test. Will retry on Hadoop-QA, but I can't reproduce it here.
TestRegionRebalancing: With the 7s sleep (i.e. the same sleep as before), I can't reproduce it. I will try to understand why this sleep changes the result, but in any case it's not a regression.

So I think this patch is a good candidate for a commit. Further enhancements (clusterId, replacing the ZK watcher with simple calls) could go into another JIRA.

> Cut the link between the client and the zookeeper ensemble
> ----------------------------------------------------------
>
>                 Key: HBASE-5399
>                 URL: https://issues.apache.org/jira/browse/HBASE-5399
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 0.94.0
>         Environment: all
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>         Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch
>
>
> The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack also suggested removing the link to the master from HConnection.
> There are choices to be made considering the existing API (which we don't want to break).
> The first patches I will submit on hadoop-qa should not be committed: they are there to show progress in the direction taken.
> ZooKeeper is used for:
>  - public getter, to let the client do whatever it wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
>  - read the master address to create a master => now done with a temporary zookeeper connection
>  - read root location => now done with a temporary zookeeper connection, but questionable. Used in the public function "locateRegion". To be reworked.
>  - read cluster id => now done once with a temporary zookeeper connection.
>  - check if the base node is available => now done once with a zookeeper connection given as a parameter
>  - isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection.
>      - Called internally from HBaseAdmin and HTable
>  - getCurrentNrHRS(): public function to get the number of region servers and create a thread pool => now done with a temporary zookeeper connection
>
> Master is used for:
>  - getMaster public getter, as for ZooKeeper => we have to deprecate this but keep it.
>  - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
>  - getHTableDescriptor*: public functions offering access to the master. => we could make them use a temporary master connection as well.
> Main points are:
>  - the hbase class for ZooKeeper, ZooKeeperWatcher, is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). In any case, a non-connected client will always be much slower, because it needs a TCP connection, and establishing a TCP connection is slow.
>  - having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client.
>  - if we move the table descriptor part away from the client, we need to find a new place for it.
>  - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
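
For readers of the archive, the "temporary zookeeper connection" idea from the quoted description can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the attached patches: CoordinationSession, TemporarySessions and FakeSession are hypothetical names that are not part of HBase or ZooKeeper, the session is an in-memory fake rather than a real ZooKeeper client, and the path "/hbase/master" is used purely as an example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical stand-in for a coordination-service session (not the HBase or ZooKeeper API). */
interface CoordinationSession extends AutoCloseable {
    String read(String path);

    @Override
    void close(); // narrowed to throw no checked exception, to keep the sketch short
}

/** Opens one session per call and closes it right after, instead of holding one per client. */
final class TemporarySessions {
    private final Supplier<CoordinationSession> factory;

    TemporarySessions(Supplier<CoordinationSession> factory) {
        this.factory = factory;
    }

    <T> T withSession(Function<CoordinationSession, T> action) {
        // try-with-resources closes the session even if the action throws
        try (CoordinationSession session = factory.get()) {
            return action.apply(session);
        }
    }
}

/** In-memory fake, used only to make the open/close accounting visible. */
final class FakeSession implements CoordinationSession {
    static final AtomicInteger OPEN = new AtomicInteger();
    static final AtomicInteger OPENED_TOTAL = new AtomicInteger();

    FakeSession() {
        OPEN.incrementAndGet();
        OPENED_TOTAL.incrementAndGet();
    }

    @Override
    public String read(String path) {
        return "value-of:" + path; // a real client would fetch the node data here
    }

    @Override
    public void close() {
        OPEN.decrementAndGet();
    }
}

final class TemporarySessionDemo {
    public static void main(String[] args) {
        TemporarySessions sessions = new TemporarySessions(FakeSession::new);
        String master = sessions.withSession(s -> s.read("/hbase/master"));
        System.out.println(master + ", sessions still open: " + FakeSession.OPEN.get());
    }
}
```

The trade-off is the one the description already names: no session stays open between calls, so the number of simultaneous connections is bounded by concurrent calls rather than by the number of clients, but every call pays the connection setup cost.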