Date: Sat, 10 Mar 2012 00:42:58 +0000 (UTC)
From: "jiraposter@reviews.apache.org (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1718337999.46434.1331340178557.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <127858397.45350.1326736780610.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup

[ 
https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226641#comment-13226641 ]

jiraposter@reviews.apache.org commented on HBASE-5209:
------------------------------------------------------

bq. On 2012-03-10 00:35:03, Michael Stack wrote:
bq. > lgtm

I didn't actually submit my updated patch yet in response to Jon's comments. :D I will soon.

bq. On 2012-03-10 00:35:03, Michael Stack wrote:
bq. > src/main/java/org/apache/hadoop/hbase/ClusterStatus.java, line 227
bq. >
bq. >
bq. > This method seems a little superfluous

I wanted to stay consistent with the other accessor APIs in this class that operated on similar lists.

bq. On 2012-03-10 00:35:03, Michael Stack wrote:
bq. > src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 1396
bq. >
bq. >
bq. > Should we be writing the master name by doing ServerName#getVersionedBytes and then parseVersionedServerName

I can do this.

- David

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3892/#review5811
-----------------------------------------------------------

On 2012-02-16 06:30:31, David Wang wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/3892/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2012-02-16 06:30:31)
bq.
bq.
bq. Review request for hbase.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. Problem:
bq. There is no method in the HBase client-facing APIs to determine which of the masters is currently active. This can be especially useful in setups with multiple backup masters.
bq.
bq. Solution:
bq. Augment ClusterStatus to return the currently active master and the list of backup masters.
bq.
bq. Notes:
bq. 
* I uncovered a race condition in ActiveMasterManager, between when it determines that it did not win the original race to be the active master, and when it reads the ServerName of the active master. If the active master goes down in that time, the read to determine the active master's ServerName will fail ungracefully and the candidate master will abort. The solution incorporated in this patch is to check to see if the read of the ServerName succeeded before trying to use it.
bq. * I fixed some minor formatting issues while going through the code. I can take these changes out if it is considered improper to commit such non-related changes with the main changes.
bq.
bq.
bq. This addresses bug HBASE-5209.
bq. https://issues.apache.org/jira/browse/HBASE-5209
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. src/main/java/org/apache/hadoop/hbase/ClusterStatus.java b849429
bq. src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 2f60b23
bq. src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9d21903
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java f6f3f71
bq. src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 111f76e
bq. src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 3e3d131
bq. src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 16e4744
bq. src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java bc98fb0
bq.
bq. Diff: https://reviews.apache.org/r/3892/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. * Ran mvn -P localTests test multiple times - no new tests fail
bq. * Ran mvn -P localTests -Dtest=TestActiveMasterManager test multiple runs - no failures
bq. * Ran mvn -P localTests -Dtest=TestMasterFailover test multiple runs - no failures
bq. * Started active and multiple backup masters, then killed active master, then brought it back up (will now be a backup master)
bq. * Did the following before and after killing
bq. 
* hbase hbck -details - checked output to see that active and backup masters are reported properly
bq. * zk_dump - checked that active and backup masters are reported properly
bq. * Started cluster with no backup masters to make sure change operates correctly that way
bq. * Tested build with this diff vs. build without this diff, in all combinations of client and server
bq. * Verified that new client can run against old servers without incident and with the defaults applied.
bq. * Note that old clients get an error when running against new servers, because the old readFields() code in ClusterStatus does not handle exceptions of any kind. This is not solvable, at least in the scope of this change.
bq.
bq. 12/02/15 15:15:38 INFO zookeeper.ClientCnxn: Session establishment complete on server haus02.sf.cloudera.com/172.29.5.33:30181, sessionid = 0x135834c75e20008, negotiated timeout = 5000
bq. 12/02/15 15:15:39 ERROR io.HbaseObjectWritable: Error in readFields
bq. A record version mismatch occured. Expecting v2, found v3
bq. at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
bq. at org.apache.hadoop.hbase.ClusterStatus.readFields(ClusterStatus.java:247)
bq. at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:583)
bq. at org.apache.hadoop.hbase.io.HbaseObjectWritable.readFields(HbaseObjectWritable.java:297)
bq.
bq. * Ran dev-support/test-patch.sh - no new issues fail:
bq.
bq. -1 overall.
bq.
bq. +1 @author. The patch does not contain any @author tags.
bq.
bq. +1 tests included. The patch appears to include 7 new or modified tests.
bq.
bq. -1 javadoc. The javadoc tool appears to have generated -136 warning messages.
bq.
bq. +1 javac. The applied patch does not increase the total number of javac compiler warnings.
bq.
bq. +1 findbugs. The patch does not introduce any new Findbugs (version ) warnings.
bq.
bq. +1 release audit. The applied patch does not increase the total number of release audit warnings.
bq. 
bq.
bq. Thanks,
bq.
bq. David
bq.
bq.

> HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-5209
> URL: https://issues.apache.org/jira/browse/HBASE-5209
> Project: HBase
> Issue Type: Improvement
> Components: master
> Affects Versions: 0.90.5, 0.92.0, 0.94.0
> Reporter: Aditya Acharya
> Assignee: David S. Wang
> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5209.addendum, HBASE_5209_v5.diff
>
>
> I have a multi-master HBase set up, and I'm trying to programmatically determine which of the masters is currently active. But the API does not allow me to do this. There is a getMaster() method in the HConnection class, but it returns an HMasterInterface, whose methods do not allow me to find out which master won the last race. The API should have a getActiveMasterHostname() or something to that effect.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
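
Illustration of the race described in the review notes above: a candidate master loses the race to become active, and the active master then dies before the candidate reads its ServerName. The sketch below is not the HBASE-5209 patch and does not use HBase or ZooKeeper classes. FakeZnodeStore, ActiveMasterReadSketch, tryBecomeActive, Outcome and the MASTER_PATH value are all hypothetical names made up for this example, and the in-memory map only stands in for the coordination store. It shows the general idea of checking that the read succeeded before using the result.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a coordination store; not HBase or ZooKeeper code.
final class FakeZnodeStore {
    private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

    // Returns true only if this caller created the node, i.e. won the race.
    boolean createIfAbsent(String path, String data) {
        return nodes.putIfAbsent(path, data) == null;
    }

    // Returns null when the node no longer exists, e.g. because its owner went down.
    String read(String path) {
        return nodes.get(path);
    }

    void delete(String path) {
        nodes.remove(path);
    }
}

final class ActiveMasterReadSketch {
    // Assumed path, used for illustration only.
    static final String MASTER_PATH = "/hbase/master";

    enum Outcome { BECAME_ACTIVE, WAITING_AS_BACKUP, ACTIVE_VANISHED_RETRY }

    // betweenSteps runs in the window between losing the race and reading the
    // winner's name, so a caller can simulate the active master dying there.
    static Outcome tryBecomeActive(FakeZnodeStore store, String self, Runnable betweenSteps) {
        if (store.createIfAbsent(MASTER_PATH, self)) {
            return Outcome.BECAME_ACTIVE;
        }
        betweenSteps.run();
        String active = store.read(MASTER_PATH);
        if (active == null) {
            // The read did not succeed: report it so the caller can retry
            // instead of dereferencing a missing name and aborting.
            return Outcome.ACTIVE_VANISHED_RETRY;
        }
        return Outcome.WAITING_AS_BACKUP;
    }

    public static void main(String[] args) {
        FakeZnodeStore store = new FakeZnodeStore();
        System.out.println(tryBecomeActive(store, "m1", () -> { }));
        System.out.println(tryBecomeActive(store, "m2", () -> { }));
        // Simulate the active master going down inside the race window.
        System.out.println(tryBecomeActive(store, "m3", () -> store.delete(MASTER_PATH)));
    }
}
```

In this sketch the third call reports ACTIVE_VANISHED_RETRY rather than failing, which mirrors the "check whether the read succeeded before using it" approach described in the notes. How the real ActiveMasterManager retries or blocks is not shown here.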