Date: Tue, 19 Feb 2019 21:04:00 +0000 (UTC)
From: "ASF subversion and git services (JIRA)"
To: issues@geode.apache.org
Reply-To: dev@geode.apache.org
Subject: [jira] [Commented] (GEODE-6369) Cache-creation failure after a successful auto-reconnect causes subsequent NPE

    [ https://issues.apache.org/jira/browse/GEODE-6369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772325#comment-16772325 ]

ASF subversion and git services commented on GEODE-6369:
--------------------------------------------------------

Commit 7661eca53df6fa5c71ec21dc3de35eba5cb3e202 in geode's branch refs/heads/develop from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=7661eca ]

GEODE-6369 Cache-creation failure after a successful auto-reconnect causes subsequent NPE

If an error occurs while rebuilding the cache on auto-reconnect & we
can't continue we should throw an exception to any thread waiting for
the reconnect to complete.

If we're unable to contact the cluster configuration service we do not
terminate auto-reconnect attempts.

New members are now only allowed to join after view preparation has
completed. This will reduce the number of "surprise members" and also
ensures that any old member IDs have been removed from the view.

We now only attempt to use findCoordinatorFromView multiple times if the
view actually changes. Instead we contact locators again to see if
there are new registrants.

fixing the above exposed other problems in auto-reconnect:

* messages were being thrown away by the location service quorum checker
  during auto-reconnect. some of these were "join" messages that needed
  to be delivered to the new membership service
* registrants weren't being removed from the recovered membership view in
  the locator. This confused restarting nodes because the recovered
  membership view has stale info in it that they don't want to use
* locator services restart were hanging due to profile interchange being
  done under synchronization


> Cache-creation failure after a successful auto-reconnect causes subsequent NPE
> ------------------------------------------------------------------------------
>
>                 Key: GEODE-6369
>                 URL: https://issues.apache.org/jira/browse/GEODE-6369
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce Schuchardt
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If your server auto-reconnects but there is a problem recreating the cache the JGroups channel used for auto-reconnect is closed. This causes an NPE when the server makes another auto-reconnect attempt.
> The server should instead just log the problem and shut down since future attempts to recreate the cache will probably run into the same issue.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
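
The first point of the commit message, that a thread waiting for the reconnect to complete should receive an exception when the cache cannot be rebuilt, can be illustrated with a small language-neutral sketch. This is not Geode's code: it is written in Python for brevity, and `ReconnectState`, `ReconnectFailedError`, `rebuild_cache` and `create_cache` are hypothetical names chosen for illustration only.

```python
import threading


class ReconnectFailedError(Exception):
    """Hypothetical error delivered to threads waiting on a failed reconnect."""


class ReconnectState:
    """Hypothetical holder for the outcome of one auto-reconnect attempt."""

    def __init__(self):
        self._done = threading.Event()
        self._failure = None

    def complete(self):
        # Reconnect and cache creation both succeeded: release the waiters.
        self._done.set()

    def fail(self, cause):
        # Record the cause first, then release the waiters so they can see it.
        self._failure = cause
        self._done.set()

    def wait_until_reconnected(self, timeout=None):
        # Returns False on timeout, True on success, and raises on failure
        # instead of letting the waiter continue with a missing cache.
        if not self._done.wait(timeout):
            return False
        if self._failure is not None:
            raise ReconnectFailedError("cache could not be rebuilt") from self._failure
        return True


def rebuild_cache(state, create_cache):
    """Run the (hypothetical) cache factory and publish the outcome to waiters."""
    try:
        create_cache()
    except Exception as exc:
        state.fail(exc)
    else:
        state.complete()
```

The design choice shown is that the failure is stored before the waiters are released, so a waiting thread either sees success or gets the original cause chained to the exception it receives.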