geode-issues mailing list archives

From "Dan Smith (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (GEODE-7012) Distributed deadlock with StartupMessages if executor pools get full
Date Mon, 05 Aug 2019 17:57:00 GMT

     [ https://issues.apache.org/jira/browse/GEODE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Smith resolved GEODE-7012.
------------------------------
       Resolution: Fixed
    Fix Version/s: 1.10.0

> Distributed deadlock with StartupMessages if executor pools get full
> --------------------------------------------------------------------
>
>                 Key: GEODE-7012
>                 URL: https://issues.apache.org/jira/browse/GEODE-7012
>             Project: Geode
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Dan Smith
>            Assignee: Ernest Burghardt
>            Priority: Major
>             Fix For: 1.10.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> We hit a distributed deadlock in one of our tests where two members are hung
> sending startup messages to each other.
> It turns out that until a member gets a response to a StartupMessage, it is in
> a state where it blocks all outgoing messages. At the same time, the member is
> receiving and attempting to respond to other messages, but those responses get
> blocked. If too many messages come in before the StartupResponseMessage, the
> blocked responses end up filling the
> ClusterDistributionManager.highPriorityPool.
> If two members are trying to start up at the same time, and they both fill up
> the highPriorityPool, they will both fail to process each other's
> StartupMessage, because that message is executed in the same pool.
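
The pool starvation described above can be reproduced in miniature with plain
java.util.concurrent classes. The sketch below is illustrative only and does
not use Geode code: the Member class, its fields, and bothMembersStart are
hypothetical names, delivery of the StartupResponseMessage is reduced to
counting down a latch, and the separate startup executor is just one
conceivable way to break the cycle, not necessarily the fix applied for this
issue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class StartupDeadlockSketch {

  /** Hypothetical stand-in for a cluster member; not a Geode class. */
  static final class Member {
    final ExecutorService highPriorityPool;
    final ExecutorService startupPool; // same object as highPriorityPool unless separated
    final CountDownLatch startupResponseReceived = new CountDownLatch(1);

    Member(int poolSize, boolean separateStartupPool) {
      highPriorityPool = Executors.newFixedThreadPool(poolSize);
      startupPool = separateStartupPool
          ? Executors.newSingleThreadExecutor()
          : highPriorityPool;
    }

    /** An ordinary incoming message: its reply is blocked until startup completes. */
    void receiveOrdinaryMessage() {
      highPriorityPool.execute(() -> {
        try {
          startupResponseReceived.await(); // outgoing reply is held back here
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }

    /** Incoming StartupMessage: processing it lets the sender finish startup. */
    void receiveStartupMessage(Member sender) {
      startupPool.execute(sender.startupResponseReceived::countDown);
    }

    void shutdown() {
      highPriorityPool.shutdownNow();
      startupPool.shutdownNow();
    }
  }

  /** Returns true if both members receive their startup response within the timeout. */
  static boolean bothMembersStart(int poolSize, boolean separateStartupPool) {
    Member a = new Member(poolSize, separateStartupPool);
    Member b = new Member(poolSize, separateStartupPool);
    try {
      // Enough ordinary messages arrive to occupy every pool thread on both members.
      for (int i = 0; i < poolSize; i++) {
        a.receiveOrdinaryMessage();
        b.receiveOrdinaryMessage();
      }
      // Each member now receives the other's StartupMessage.
      a.receiveStartupMessage(b);
      b.receiveStartupMessage(a);
      return a.startupResponseReceived.await(500, TimeUnit.MILLISECONDS)
          && b.startupResponseReceived.await(500, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      a.shutdown();
      b.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("shared pool:   started=" + bothMembersStart(2, false));
    System.out.println("separate pool: started=" + bothMembersStart(2, true));
  }
}
```

With a shared pool, every worker thread on both members is parked waiting for
a startup response, so the queued StartupMessage task never runs and the call
is expected to time out. With the hypothetical separate executor, the
StartupMessage is processed independently of the full pool and both members
proceed.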



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
