Message-ID: <981222157.1208299822099.JavaMail.jira@brutus>
Date: Tue, 15 Apr 2008 15:50:22 -0700 (PDT)
From: "Jim Kellerman (JIRA)"
To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-500) Regionserver stuck on exit
In-Reply-To: <1442770274.1204942670673.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HBASE-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589270#action_12589270 ]

Jim Kellerman commented on HBASE-500:
-------------------------------------

In the thread dump, the cacheFlusher thread is not present, so HRegionServer.run has interrupted it.

HRegionServer.run is now waiting to enter the synchronized block around logRollerLock, which is held by the logRoller thread. This means that HRegionServer.run has already interrupted the compactSplit thread (which does not protect itself using compactSplitLock, as it should while it is working).

The logRoller, HRegionServer$Worker and compactSplit threads are all waiting to acquire HLog.cacheFlushLock.

It is not clear why none of these threads can proceed, as it appears that no thread could currently be holding that lock.

Certainly, interrupting the compactSplit thread while it is working is an error. I am creating a patch to fix this error and will see whether the lockup still occurs after it is committed.

> Regionserver stuck on exit
> --------------------------
>
>                 Key: HBASE-500
>                 URL: https://issues.apache.org/jira/browse/HBASE-500
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.16.0, 0.2.0, 0.1.0
>            Reporter: stack
>            Assignee: Jim Kellerman
>         Attachments: stacktrace.txt
>
>
> Found in 0.16.0 cluster. We're rolling a log and trying to split it too. Looks like hung up locks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
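
For readers following the comment above, here is a minimal sketch of the pattern it refers to: a worker holds a lock for the duration of each unit of work, and the shutdown path acquires the same lock before interrupting, so the interrupt can only arrive while the worker is idle. This is not the HBase code or the eventual patch. The names Worker, workLock, shutdown and runDemo are hypothetical, with workLock standing in for something like compactSplitLock.

```java
import java.util.concurrent.atomic.AtomicInteger;

class GuardedInterruptSketch {

    /** Hypothetical stand-in for a worker such as the compactSplit thread. */
    static class Worker extends Thread {
        // Hypothetical analogue of compactSplitLock: held during each unit of work.
        final Object workLock = new Object();
        // Counts interrupts that arrived in the middle of a unit of work.
        final AtomicInteger interruptedMidWork = new AtomicInteger();
        private volatile boolean stopRequested = false;

        @Override
        public void run() {
            while (!stopRequested) {
                synchronized (workLock) {
                    if (stopRequested) {
                        break; // shutdown began while we were waiting for the lock
                    }
                    try {
                        Thread.sleep(20); // simulated unit of work
                    } catch (InterruptedException e) {
                        // Would indicate the bug described above: interrupted while working.
                        interruptedMidWork.incrementAndGet();
                        return;
                    }
                }
                try {
                    Thread.sleep(5); // idle wait between units; a safe place to be interrupted
                } catch (InterruptedException e) {
                    // Woken up to exit; the loop condition re-checks stopRequested.
                }
            }
        }

        /** Called by the shutdown path (the role HRegionServer.run plays in the comment). */
        void shutdown() {
            stopRequested = true;
            // Taking the lock first means we wait for any in-progress unit of work
            // to finish, so the interrupt never lands in the middle of it.
            synchronized (workLock) {
                interrupt();
            }
        }
    }

    /** Runs one worker, shuts it down, and returns the number of mid-work interruptions. */
    static int runDemo() {
        Worker worker = new Worker();
        worker.start();
        try {
            Thread.sleep(100); // let the worker run a few units
            worker.shutdown();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.interruptedMidWork.get();
    }

    public static void main(String[] args) {
        System.out.println("mid-work interruptions: " + runDemo());
    }
}
```

If the synchronized block in run() were removed, shutdown() could interrupt the worker during its simulated work, which is the situation the comment calls an error. Note that the sketch addresses only the interrupt-while-working problem; it does not explain the separate wait on HLog.cacheFlushLock.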