Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id 2B9FB10E76
	for ; Fri, 2 Aug 2013 23:29:49 +0000 (UTC)
Received: (qmail 16456 invoked by uid 500); 2 Aug 2013 23:29:49 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 16428 invoked by uid 500); 2 Aug 2013 23:29:49 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 16415 invoked by uid 99); 2 Aug 2013 23:29:48 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
	by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 02 Aug 2013 23:29:48 +0000
Date: Fri, 2 Aug 2013 23:29:48 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (HBASE-9095) AssignmentManager's handleRegion
 should respect the single threaded nature of the processing
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728258#comment-13728258 ]

stack commented on HBASE-9095:
------------------------------

Please rerun this patch a few times before committing. We got a zombie above. I've not seen that in a long time. Let me rerun for you now.

> AssignmentManager's handleRegion should respect the single threaded nature of the processing
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9095
>                 URL: https://issues.apache.org/jira/browse/HBASE-9095
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.95.2
>
>         Attachments: 9095-1.txt, 9095-1.txt, 9095-1.txt
>
>
> While debugging a case where a region was opened on a RegionServer and then closed soon after (and then never re-opened anywhere), it seemed that the part of handleRegion that deletes ZK nodes should be synchronous. This achieves two things:
> 1. Synchronous deletion prevents the same event data from being processed more than once. Assuming that we do get more than one notification (say, for a region OPENED event), any subsequent processing in handleRegion for the same znode would end up with a ZooKeeper node-not-found exception. The return value of the data read would be null, and that is already handled. If the deletion is asynchronous, it leads to issues such as: the master opens a region on a certain RegionServer, soon after sends that RegionServer a close for the same region, and then the znode is deleted.
> 2. The deletion is currently handled in an executor service. This is problematic since, by design, the events for a given region should be processed in order. By delegating a part of the processing to an executor service we are somewhat violating this contract, since there is no guarantee of ordering among the executor service executions...
> Thanks to [~jeffreyz] and [~enis] for the discussions on this issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
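
To make point 1 of the description concrete, here is a minimal sketch of how a synchronous delete turns a duplicate notification into a no-op. It is not HBase code: the class SyncDeleteSketch, its methods, and the in-memory map standing in for ZooKeeper are all hypothetical, and the real handleRegion does considerably more.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; this is not HBase's AssignmentManager.
class SyncDeleteSketch {
    // In-memory stand-in for ZooKeeper: region name -> event data.
    private final Map<String, String> znodes = new ConcurrentHashMap<>();
    private int processed = 0;

    // Simulates a RegionServer publishing an event (e.g. OPENED) for a region.
    void createNode(String region, String eventData) {
        znodes.put(region, eventData);
    }

    // Called once per notification, on a single thread.
    // Returns true if the event was processed, false if the znode was already gone.
    boolean handleRegion(String region) {
        String data = znodes.get(region);
        if (data == null) {
            // Duplicate notification: an earlier call already deleted the node.
            return false;
        }
        processed++;
        // Delete before returning, so the next notification for this
        // region cannot observe the same event data again.
        znodes.remove(region);
        return true;
    }

    int processedCount() {
        return processed;
    }

    public static void main(String[] args) {
        SyncDeleteSketch sketch = new SyncDeleteSketch();
        sketch.createNode("region-a", "OPENED");
        System.out.println("first notification processed: " + sketch.handleRegion("region-a"));
        System.out.println("duplicate notification processed: " + sketch.handleRegion("region-a"));
        System.out.println("events processed: " + sketch.processedCount());
    }
}
```

If the remove call were instead handed to an executor, a second notification arriving before the task ran would read the same data and process the event again, which is the kind of ordering problem the description attributes to the asynchronous delete.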