Subject: svn commit: r654301 - in /hadoop/hbase/trunk: CHANGES.txt src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Date: Wed, 07 May 2008 22:08:23 -0000
To: hbase-commits@hadoop.apache.org
From: jimk@apache.org
Reply-To: hbase-dev@hadoop.apache.org
Message-Id: <20080507220824.A755F238896F@eris.apache.org>

Author: jimk
Date: Wed May 7 15:08:21 2008
New Revision: 654301

URL: http://svn.apache.org/viewvc?rev=654301&view=rev
Log:
HBASE-611 regionserver should do basic health check before reporting alls-well to the master

Modified:
    hadoop/hbase/trunk/CHANGES.txt
    hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java

Modified: hadoop/hbase/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/hbase/trunk/CHANGES.txt?rev=654301&r1=654300&r2=654301&view=diff
==============================================================================
--- hadoop/hbase/trunk/CHANGES.txt (original)
+++ hadoop/hbase/trunk/CHANGES.txt Wed May 7 15:08:21 2008
@@ -46,6 +46,8 @@
    HBASE-47    Option to set TTL for columns in hbase
                (Andrew Purtell via Bryan Duxbury and Stack)
    HBASE-600   Filters have excessive DEBUG logging
+   HBASE-611   regionserver should do basic health check before reporting
+               alls-well to the master
 
 Release 0.1.1 - 04/11/2008
 

Modified: hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
URL: http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java?rev=654301&r1=654300&r2=654301&view=diff
==============================================================================
--- hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java (original)
+++ hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java Wed May 7 15:08:21 2008
@@ -270,7 +270,7 @@
       init(reportForDuty(sleeper));
       long lastMsg = 0;
       // Now ask master what it wants us to do and tell it what we have done
-      for (int tries = 0; !stopRequested.get();) {
+      for (int tries = 0; !stopRequested.get() && isHealthy();) {
         long now = System.currentTimeMillis();
         if (lastMsg != 0 && (now - lastMsg) >= serverLeaseTimeout) {
           // It has been way too long since we last reported to the master.
@@ -576,7 +576,26 @@
         serverInfo.getServerAddress().toString());
   }
 
-  /* Run some housekeeping tasks before we go into 'hibernation' sleeping at
+  /*
+   * Verify that server is healthy
+   */
+  private boolean isHealthy() {
+    if (!fsOk) {
+      // File system problem
+      return false;
+    }
+    // Verify that all threads are alive
+    if (!(leases.isAlive() && compactSplitThread.isAlive() &&
+        cacheFlusher.isAlive() && logRoller.isAlive() &&
+        workerThread.isAlive())) {
+      // One or more threads are no longer alive - shut down
+      stop();
+      return false;
+    }
+    return true;
+  }
+  /*
+   * Run some housekeeping tasks before we go into 'hibernation' sleeping at
    * the end of the main HRegionServer run loop.
    */
   private void housekeeping() {