Date: Sat, 26 May 2012 17:41:23 +0000 (UTC)
From: "ramkrishna.s.vasudevan (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1784205185.6107.1338054083142.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <408480689.13933.1337347206664.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.

    [ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284026#comment-13284026 ]

ramkrishna.s.vasudevan commented on HBASE-6046:
-----------------------------------------------

When the master retries coming up in
{code}
  private boolean tryRecoveringExpiredZKSession() throws InterruptedException,
      IOException, KeeperException, ExecutionException {
{code}
we just initialize the ZK trackers, assign root and meta, and finally assign any regions in transition. But if an RS goes down during this time, we completely miss those callbacks and its logs are never split. Also, because the AM is reinitialized, we always treat this as a clean cluster start.

> Master retry on ZK session expiry causes inconsistent region assignments.
> -------------------------------------------------------------------------
>
>                 Key: HBASE-6046
>                 URL: https://issues.apache.org/jira/browse/HBASE-6046
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: Gopinathan A
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.2, 0.94.1
>
>
> 1> A ZK session timeout in the HMaster leads to bulk assignment even though all the RSs are online.
> 2> While doing bulk assignment, if the master goes down again and restarts (or a backup comes up), all the nodes created in ZK are now reassigned to the new RSs. This leads to double assignment.
> We had 2800 regions; of these, 1900 regions got double assigned, taking the region count to 4700.
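
To illustrate the missed-callback problem described in the comment, here is a minimal sketch of one way a recovering master could notice region servers that died while its ZK session was expired: compare the servers it knew about before the expiry with those still registered afterwards. This is only an illustration, not HBase code and not the fix applied for HBASE-6046; the class name, the method name and the server-name strings are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, NOT the HBase implementation or the HBASE-6046 patch.
public class ExpiredSessionRecoverySketch {

    // Returns the servers that were known before the ZK session expired but
    // are no longer registered after recovery. These are the servers whose
    // "RS went down" callbacks the master would have missed, so their logs
    // would still need to be split.
    static Set<String> serversLostDuringExpiry(Set<String> knownBeforeExpiry,
                                               Set<String> liveAfterRecovery) {
        Set<String> lost = new HashSet<>(knownBeforeExpiry);
        lost.removeAll(liveAfterRecovery);
        return lost;
    }

    public static void main(String[] args) {
        // Made-up server names, for illustration only.
        Set<String> before = Set.of("rs1", "rs2", "rs3");
        Set<String> after = Set.of("rs1", "rs3");

        // rs2 disappeared while the master had no valid session.
        for (String server : serversLostDuringExpiry(before, after)) {
            System.out.println("needs log splitting: " + server);
        }
    }
}
```

The point of the comparison is that the recovering master keeps what it knew before the expiry instead of starting from an empty state, which is what makes the recovery look like a clean cluster start.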