Date: Thu, 2 Jan 2014 17:02:06 +0000 (UTC)
From: "Jimmy Xiang (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-8912) [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE

    [ https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860330#comment-13860330 ]

Jimmy Xiang commented on HBASE-8912:
------------------------------------

As I said in my comments on HBASE-7521, there are many synchronization issues in the 0.94 AM, so it (AM#0.94) has trouble handling races properly. We cannot avoid those races. We have to deal with them and make sure the internal states don't get messed up.

In trunk, we have better synchronization, so it (AM#trunk) should behave better. [~jmspaggi], could you run your test on 0.96 and let us know if you can reproduce this on trunk?
We run IT with CM (ChaosMonkey) all the time and have not seen such issues on 0.96/trunk.

> [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-8912
>                 URL: https://issues.apache.org/jira/browse/HBASE-8912
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Priority: Critical
>             Fix For: 0.94.16
>
>         Attachments: 8912-0.94-alt2.txt, 8912-0.94.txt, 8912-fix-race.txt, HBase-0.94 #1036 test - testRetrying [Jenkins].html, log.txt, org.apache.hadoop.hbase.catalog.TestMetaReaderEditor-output.txt
>
>
> AM throws this exception, which subsequently causes the master to abort:
> {code}
> java.lang.IllegalStateException: Unexpected state : testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b. state=PENDING_OPEN, ts=1372891751912, server=hemera.apache.org,39064,1372891746132 .. Cannot transit it to OFFLINE.
>     at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>     at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
> {code}
> This exception trace is from the test TestMetaReaderEditor, which fails fairly frequently. Looking at the test code, though, I think this is not a test-only issue, but affects
the main code path.
> https://builds.apache.org/job/HBase-0.94/1036/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)