Mailing-List: contact issues-help@geode.incubator.apache.org; run by ezmlm
Reply-To: dev@geode.incubator.apache.org
Date: Wed, 11 Nov 2015 02:02:11 +0000 (UTC)
From: "Dan Smith (JIRA)"
To: issues@geode.incubator.apache.org
Subject: [jira] [Assigned] (GEODE-542) Race in FunctionService.onMembers can result in hang during member startup

     [ https://issues.apache.org/jira/browse/GEODE-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Smith reassigned GEODE-542:
-------------------------------

    Assignee: Dan Smith

> Race in FunctionService.onMembers can result in hang during member startup
> --------------------------------------------------------------------------
>
>                 Key: GEODE-542
>                 URL: https://issues.apache.org/jira/browse/GEODE-542
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Dan Smith
>            Assignee: Dan Smith
>
> I hit this while doing some internal tests of FunctionService. I have a function that calls CacheFactory.getAnyInstance(). Occasionally, while a member was starting up, the function call would never receive a reply.
> Turning on debug logging, I found this in the logs:
> {noformat}
> [fine 2015/10/28 17:15:41.903 PDT clientgemfire2_gluon_2055 tid=0x37] shutdown caught, abandoning message: A cache has not yet been created.
> com.gemstone.gemfire.cache.CacheClosedException: A cache has not yet been created.
>     at com.gemstone.gemfire.cache.CacheFactory.getAnyInstance(CacheFactory.java:292)
>     at com.gemstone.gemfire.internal.cache.execute.util.RollbackFunction.execute(RollbackFunction.java:82)
>     at com.gemstone.gemfire.internal.cache.MemberFunctionStreamingMessage.process(MemberFunctionStreamingMessage.java:194)
>     at com.gemstone.gemfire.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:380)
>     at com.gemstone.gemfire.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:451)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at com.gemstone.gemfire.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:701)
>     at com.gemstone.gemfire.distributed.internal.DistributionManager$9$1.run(DistributionManager.java:1158)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This seems wrong: because the member never replies to the function call, the caller can hang. I think this code was intended for use during shutdown, but it is also hit during startup, because members are available to process functions before the cache is created. That in itself is perhaps problematic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
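
The hang described in the report can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Geode code and not necessarily the fix adopted for GEODE-542: the class and method names are hypothetical, a CompletableFuture stands in for the reply the caller waits on, and IllegalStateException stands in for CacheClosedException. It contrasts a handler that abandons the message when the function fails with one that sends the failure back as the reply.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical model of the race; none of these names exist in Geode.
public class ReplyOnFailureSketch {

    /** Stand-in for a function body that needs a cache which has not been created yet. */
    public static String functionNeedingCache() {
        // Assumption: IllegalStateException plays the role of CacheClosedException.
        throw new IllegalStateException("A cache has not yet been created.");
    }

    /** Models the reported behavior: the failure is swallowed and no reply is sent. */
    public static void processAndAbandon(Supplier<String> function,
                                         CompletableFuture<String> reply) {
        try {
            reply.complete(function.get());
        } catch (RuntimeException e) {
            // "abandoning message": the reply future is never completed.
        }
    }

    /** Alternative: always reply, passing the failure back to the caller. */
    public static void processAndReply(Supplier<String> function,
                                       CompletableFuture<String> reply) {
        try {
            reply.complete(function.get());
        } catch (RuntimeException e) {
            reply.completeExceptionally(e);
        }
    }

    /** Waits briefly for a reply and describes what the caller observed. */
    public static String awaitReply(CompletableFuture<String> reply) {
        try {
            return "result: " + reply.get(200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Without the timeout, the caller would block here indefinitely.
            return "no reply (caller would hang)";
        } catch (ExecutionException e) {
            return "error reply: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> abandoned = new CompletableFuture<>();
        processAndAbandon(ReplyOnFailureSketch::functionNeedingCache, abandoned);
        System.out.println(awaitReply(abandoned));

        CompletableFuture<String> replied = new CompletableFuture<>();
        processAndReply(ReplyOnFailureSketch::functionNeedingCache, replied);
        System.out.println(awaitReply(replied));
    }
}
```

In this model the first caller only gets out because of the 200 ms timeout, which corresponds to the hang in the report, while the second caller receives the error immediately and can decide whether to retry once the member has finished starting.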