geode-issues mailing list archives

From "Bruce Schuchardt (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (GEODE-6423) availability checks sometimes immediately initiate removal
Date Fri, 22 Feb 2019 22:02:00 GMT

     [ https://issues.apache.org/jira/browse/GEODE-6423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce Schuchardt resolved GEODE-6423.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.10.0

> availability checks sometimes immediately initiate removal
> ----------------------------------------------------------
>
>                 Key: GEODE-6423
>                 URL: https://issues.apache.org/jira/browse/GEODE-6423
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce Schuchardt
>            Assignee: Bruce Schuchardt
>            Priority: Major
>             Fix For: 1.10.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> If the network goes down, the JGroupsMessenger service initiates suspect processing when
it tries to send messages.  In 1.8 this appears to lead to immediate removal of the suspect:
> An IOException while sending a UDP message initiates suspicion.
> Suspect processing initiates a final check.
> The final check fails immediately (it uses a timed Socket.connect(), which fails at once
when the network is down instead of waiting for its timeout).
> The member is declared dead.
> {noformat}
> [info 2019/02/13 17:44:59.366 CST perf157-130-167-server1 <Geode Failure Detection
thread 3> tid=0xc2] received suspect message from myself for 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000:
Unable to send messages to this member via JGroups
> [info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection
thread 4> tid=0xc3] Performing final check for suspect member 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000
reason=Unable to send messages to this member via JGroups
> [info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection
thread 5> tid=0xc4] Performing final check for suspect member 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202
reason=Unable to send messages to this member via JGroups
> [info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection
thread 4> tid=0xc3] Failure detection is now watching 192.168.130.167(perf157-130-167-server1:225263)<v1>:16200
> [info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection
thread 5> tid=0xc4] Failure detection is now watching 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000
> [info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection
thread 3> tid=0xc2] received suspect message from myself for 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201:
Unable to send messages to this member via JGroups
> [info 2019/02/13 17:44:59.369 CST perf157-130-167-server1 <Geode Failure Detection
thread 6> tid=0xc5] Performing final check for suspect member 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201
reason=Unable to send messages to this member via JGroups
> [info 2019/02/13 17:44:59.369 CST perf157-130-167-server1 <Geode Failure Detection
thread 6> tid=0xc5] Failure detection is now watching 192.168.130.167(perf157-130-167-server1:225263)<v1>:16200
> [info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection
thread 5> tid=0xc4] Final check failed for member 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202
> [info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection
thread 5> tid=0xc4] Requesting removal of suspect member 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202
> [info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection
thread 4> tid=0xc3] Final check failed for member 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000
> [info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection
thread 4> tid=0xc3] Requesting removal of suspect member 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000
> [info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection
thread 4> tid=0xc3] This member is becoming the membership coordinator with address 192.168.130.167(perf157-130-167-server1:225263)<v1>:16200
> [info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection
thread 6> tid=0xc5] Final check failed for member 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201
> [info 2019/02/13 17:44:59.373 CST perf157-130-167-server1 <Geode Failure Detection
thread 6> tid=0xc5] Requesting removal of suspect member 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201
> [info 2019/02/13 17:44:59.376 CST perf157-130-167-server1 <Geode Failure Detection
thread 4> tid=0xc3] ViewCreator starting on:192.168.130.167(perf157-130-167-server1:225263)<v1>:16200
> [info 2019/02/13 17:44:59.376 CST perf157-130-167-server1 <Geode Membership View Creator>
tid=0xc6] View Creator thread is starting
> [info 2019/02/13 17:44:59.377 CST perf157-130-167-server1 <Geode Membership View Creator>
tid=0xc6] 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000
had a weight of 3
> [info 2019/02/13 17:44:59.377 CST perf157-130-167-server1 <Geode Membership View Creator>
tid=0xc6] 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202 had a weight of
10
> [info 2019/02/13 17:44:59.377 CST perf157-130-167-server1 <Geode Membership View Creator>
tid=0xc6] preparing new view View[192.168.130.167(perf157-130-167-server1:225263)<v1>:16200|10]
members: [192.168.130.167(perf157-130-167-server1:225263)<v1>:16200{lead}, 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201]
crashed: [192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000,
192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202]
> [info 2019/02/13 17:45:03.627 CST perf157-130-167-server1 <unicast receiver,perf157-130-167-62066>
tid=0x21] received suspect message from 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202
for 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000: Unable
to send messages to this member via JGroups
> [info 2019/02/13 17:45:03.718 CST perf157-130-167-server1 <unicast receiver,perf157-130-167-62066>
tid=0x21] Membership received a request to remove 192.168.130.167(perf157-130-167-server1:225263)<v1>:16200
from 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000 reason=Unable
to send messages to this member via JGroups
> [severe 2019/02/13 17:45:03.719 CST perf157-130-167-server1 <unicast receiver,perf157-130-167-62066>
tid=0x21] Membership service failure: Unable to send messages to this member via JGroups
> org.apache.geode.ForcedDisconnectException: Unable to send messages to this member via
JGroups
> {noformat}
>  
> We expect the final check to respect the member-timeout setting.
>  
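The expectation above can be illustrated with a minimal sketch of a final check that keeps probing until the member-timeout has elapsed, rather than treating an immediate Socket.connect() failure as proof of death. This is not taken from the Geode source and is not necessarily how the fix was implemented; the class name TimedFinalCheck, the Probe interface, the tcpProbe helper and the retryIntervalMillis parameter are all hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.TimeUnit;

public class TimedFinalCheck {

  /** Hypothetical abstraction over one connection attempt, so the loop is testable. */
  @FunctionalInterface
  public interface Probe {
    void attempt(int timeoutMillis) throws IOException;
  }

  /** Hypothetical helper: a probe that performs a timed TCP connect to host:port. */
  public static Probe tcpProbe(String host, int port) {
    return timeoutMillis -> {
      try (Socket socket = new Socket()) {
        // May throw immediately (e.g. network unreachable) or after the timeout.
        socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      }
    };
  }

  /**
   * Returns true as soon as one probe succeeds. Returns false only once
   * memberTimeoutMillis has fully elapsed, even if every probe fails at once.
   */
  public static boolean check(Probe probe, long memberTimeoutMillis, long retryIntervalMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(memberTimeoutMillis);
    while (true) {
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false; // the full timeout has passed without a successful probe
      }
      // A connect timeout of 0 means "wait forever", so never pass less than 1 ms.
      long remainingMillis = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
      try {
        probe.attempt((int) Math.min(Integer.MAX_VALUE, remainingMillis));
        return true;
      } catch (IOException e) {
        // The attempt failed, possibly immediately: pause, then retry until the deadline.
        long sleepNanos = Math.min(TimeUnit.MILLISECONDS.toNanos(retryIntervalMillis),
            deadline - System.nanoTime());
        if (sleepNanos > 0) {
          try {
            TimeUnit.NANOSECONDS.sleep(sleepNanos);
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
          }
        }
      }
    }
  }
}
```

With a loop like this, a call such as TimedFinalCheck.check(TimedFinalCheck.tcpProbe(host, port), memberTimeoutMillis, 100) would not report failure before memberTimeoutMillis has passed, so a brief network outage would not by itself lead to the removals shown in the log above.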



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
