Date: Mon, 26 Sep 2016 21:48:20 +0000 (UTC)
From: "ASF subversion and git services (JIRA)"
To: issues@geode.incubator.apache.org
Reply-To: dev@geode.incubator.apache.org
Subject: [jira] [Commented] (GEODE-1885) Missing subscription event with Offheap partitioned region during bucket rebalance.

[ https://issues.apache.org/jira/browse/GEODE-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524264#comment-15524264 ]

ASF subversion and git services commented on GEODE-1885:
--------------------------------------------------------

Commit 55a65840a4e4d427acaed1182aca869bf92ecae6 in incubator-geode's branch refs/heads/develop from [~dschneider]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=55a6584 ]

GEODE-1885: fix infinite loop

The previous fix for GEODE-1885 introduced a hang on off-heap regions. If a concurrent close/destroy of the region happens while other threads are modifying it, the thread doing the modification can get stuck in a hot loop that never terminates. The hot loop is in AbstractRegionMap, where it tests the existing region entry it finds to see if it can be modified. If the region entry has a value that says it is removed, the operation spins around and tries again.
It expects the thread that marked the entry as being removed to also remove it from the map. The previous fix for GEODE-1885 can cause that remove to not happen.

So this fix does two things:
1. On retry, remove the existing removed region entry from the map.
2. putEntryIfAbsent now only releases the current entry if it has an off-heap reference. This prevents an infinite loop in which the current thread, having just added a new entry with REMOVE_PHASE1, released it (changing it to REMOVE_PHASE2) because it saw that the region was closed/destroyed.

> Missing subscription event with Offheap partitioned region during bucket rebalance.
> -----------------------------------------------------------------------------------
>
> Key: GEODE-1885
> URL: https://issues.apache.org/jira/browse/GEODE-1885
> Project: Geode
> Issue Type: Bug
> Components: offheap
> Reporter: Anilkumar Gingade
> Assignee: Darrel Schneider
> Fix For: 1.0.0-incubating
>
>
> During a transaction operation, if a concurrent redundant bucket rebalance is in progress, the client can miss a subscription event if its primary queue is hosted on the node the bucket is moved from.
> Consider a three-node cluster N1, N2 and N3, with:
> - Client C1 connected to node N2.
> - Primary bucket region B1 on N1, and a secondary bucket for B1 on N2.
> - A transaction is started on N2, which creates an entry on B1.
> - The TX is committed. At the same time, bucket B1 on N2 is moved to N3.
> - The TX commit message from N1 is sent to N2. This also includes the subscription message that satisfies client C1.
> - On N2, for an off-heap region, when the bucket is not found locally, the exception response is sent back to N1 without processing the subscription message.
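
For readers who want a concrete picture of the first part of the fix described in the commit message (on retry, remove the existing removed entry from the map), here is a minimal sketch. It is not Geode's AbstractRegionMap code: the class RetryRemoveSketch, the Entry type, the REMOVED token and the put method are all hypothetical stand-ins built on a plain ConcurrentHashMap, and off-heap references and the REMOVE_PHASE1/REMOVE_PHASE2 states are not modeled.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RetryRemoveSketch {
    /** Hypothetical token standing in for a value that says "this entry is removed". */
    static final Object REMOVED = new Object();

    /** Hypothetical, heavily simplified stand-in for a region entry. */
    static final class Entry {
        private Object value;

        Entry(Object value) { this.value = value; }

        synchronized Object getValue() { return value; }

        /** Updates the value unless the entry is marked removed; returns false if it is. */
        synchronized boolean setIfNotRemoved(Object newValue) {
            if (value == REMOVED) {
                return false;
            }
            value = newValue;
            return true;
        }
    }

    /**
     * Puts a value, retrying when the entry found in the map is marked removed.
     * Returns the number of loop iterations that were needed.
     */
    static int put(ConcurrentMap<String, Entry> map, String key, Object value) {
        int attempts = 0;
        while (true) {
            attempts++;
            Entry existing = map.putIfAbsent(key, new Entry(value));
            if (existing == null) {
                return attempts; // our new entry went in
            }
            if (existing.setIfNotRemoved(value)) {
                return attempts; // modified the live entry in place
            }
            // The entry is marked removed. Rather than waiting for the thread that
            // marked it to also remove it (which may never happen), remove it here
            // before retrying. Without this line the loop could spin forever.
            map.remove(key, existing);
        }
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Entry> map = new ConcurrentHashMap<>();
        map.put("k", new Entry(REMOVED)); // a removed entry that nobody cleaned up
        int attempts = put(map, "k", "v1");
        System.out.println(attempts + " attempt(s), value=" + map.get("k").getValue());
    }
}
```

The two-argument map.remove(key, existing) only removes the mapping if it still points at the same removed entry, so a fresh entry added concurrently by another thread is left alone.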