Mailing-List: contact hbase-dev-help@hadoop.apache.org; run by ezmlm
Reply-To: hbase-dev@hadoop.apache.org
Message-ID: <586532347.1205268649444.JavaMail.jira@brutus>
Date: Tue, 11 Mar 2008 13:50:49 -0700 (PDT)
From: "stack (JIRA)"
To: hbase-dev@hadoop.apache.org
Subject: [jira] Created: (HBASE-503) cluster won't shut down

cluster won't shut down
-----------------------

                 Key: HBASE-503
                 URL: https://issues.apache.org/jira/browse/HBASE-503
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.1.0, 0.2.0, 0.16.0
            Reporter: stack

Master is stuck trying to shut down. It gets confused if it's not running the shutdown.

Scenario: the cluster is being monitored by a watcher process. When a server goes down, it's restarted. In this environment, all of hbase was updated, then each server was restarted. The regionservers bounced fine but the master won't go down. It's stuck servicing reports from newly started regionservers, to which it sends a shutdown, but the cluster is of such a size that the master hasn't gone down by the time the regionservers start again.

Here is how the master log looks for one server:

{code}
2008-03-11 20:47:08,198 INFO org.apache.hadoop.hbase.HMaster: Cancelling lease for XX.XX.XX.122:60020
2008-03-11 20:47:08,198 INFO org.apache.hadoop.hbase.HMaster: Region server XX.XX.XX.122:60020: MSG_REPORT_EXITING -- lease cancelled
2008-03-11 20:47:08,398 DEBUG org.apache.hadoop.hbase.HMaster: Region server XX.XX.XX.122:60020: MSG_REPORT_EXITING -- cancelling lease
2008-03-11 20:47:16,421 INFO org.apache.hadoop.hbase.HMaster: received start message from: XX.XX.XX.122:60020
2008-03-11 20:47:20,163 DEBUG org.apache.hadoop.hbase.HMaster: Region server XX.XX.XX.122:60020: MSG_REPORT_EXITING -- cancelling lease
2008-03-11 20:47:20,163 INFO org.apache.hadoop.hbase.HMaster: Cancelling lease for XX.XX.XX.122:60020
2008-03-11 20:47:20,163 INFO org.apache.hadoop.hbase.HMaster: Region server XX.XX.XX.122:60020: MSG_REPORT_EXITING -- lease cancelled
2008-03-11 20:47:20,393 DEBUG org.apache.hadoop.hbase.HMaster: Region server XX.XX.XX.122:60020: MSG_REPORT_EXITING -- cancelling lease
2008-03-11 20:47:28,374 INFO org.apache.hadoop.hbase.HMaster: received start message from: XX.XX.XX.122:60020
2008-03-11 20:47:32,095 DEBUG org.apache.hadoop.hbase.HMaster: Region server XX.XX.XX.122:60020: MSG_REPORT_EXITING -- cancelling lease
2008-03-11 20:47:32,095 INFO org.apache.hadoop.hbase.HMaster: Cancelling lease for XX.XX.XX.122:60020
2008-03-11 20:47:32,095 INFO org.apache.hadoop.hbase.HMaster: Region server XX.XX.XX.122:60020: MSG_REPORT_EXITING -- lease cancelled
2008-03-11 20:47:32,274 DEBUG org.apache.hadoop.hbase.HMaster: Region server XX.XX.XX.122:60020: MSG_REPORT_EXITING -- cancelling lease
{code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
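
To gauge how many regionservers are caught in the exit-then-restart cycle shown in the master log, the pattern can be counted mechanically. The following is only a rough sketch, not HBase code: the class name `RestartCycleCounter` and the method `countRestarts` are hypothetical, and the two line shapes it matches are assumed from the log excerpt in this report.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: scans master log lines and counts,
// per regionserver address, how often a start message follows a cancelled lease.
final class RestartCycleCounter {
    // Line shapes assumed from the log excerpt in this report.
    private static final Pattern LEASE_CANCELLED = Pattern.compile(
        "Region server (\\S+): MSG_REPORT_EXITING -- lease cancelled");
    private static final Pattern START = Pattern.compile(
        "received start message from: (\\S+)");

    static Map<String, Integer> countRestarts(List<String> lines) {
        Map<String, Boolean> exited = new LinkedHashMap<>();
        Map<String, Integer> restarts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LEASE_CANCELLED.matcher(line);
            if (m.find()) {
                // Server reported exiting and its lease was cancelled.
                exited.put(m.group(1), true);
                continue;
            }
            m = START.matcher(line);
            if (m.find() && Boolean.TRUE.equals(exited.get(m.group(1)))) {
                // Same server came back after exiting: one restart cycle.
                restarts.merge(m.group(1), 1, Integer::sum);
                exited.put(m.group(1), false);
            }
        }
        return restarts;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "2008-03-11 20:47:08,198 INFO org.apache.hadoop.hbase.HMaster: "
                + "Region server XX.XX.XX.122:60020: MSG_REPORT_EXITING -- lease cancelled",
            "2008-03-11 20:47:16,421 INFO org.apache.hadoop.hbase.HMaster: "
                + "received start message from: XX.XX.XX.122:60020");
        System.out.println(countRestarts(sample)); // prints {XX.XX.XX.122:60020=1}
    }
}
```

Run against the full excerpt above, this would report two restart cycles for XX.XX.XX.122:60020. A count that keeps growing for many servers while the master is shutting down would match the behaviour described in this issue.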