geode-issues mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GEODE-1178) Unexpected DistributedSystemDisconnectedException caused by RejectedExecutionException
Date Tue, 12 Apr 2016 17:46:25 GMT

    [ https://issues.apache.org/jira/browse/GEODE-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237627#comment-15237627 ]

ASF subversion and git services commented on GEODE-1178:
--------------------------------------------------------

Commit 39e94bc8beb22b62ed727640bbad3511affc9923 in incubator-geode's branch refs/heads/develop
from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=39e94bc ]

GEODE-1178 Unexpected DistributedSystemDisconnectedException caused by RejectedExecutionException

This has been reported to JGroups.  While they're deciding what to do about
it, I have coded a workaround in our StatRecorder class.  StatRecorder sits
in the JGroups stack just above the transport protocol that is throwing this
exception from its down() method.  StatRecorder will now catch the exception
and, after sleeping for a short time (10ms), retry for as long as the Manager
is not shutting down.
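
The retry described above could be sketched roughly as follows. This is only
an illustration, not the committed code: RetryOnRejectSketch, Downstream and
downWithRetry are hypothetical names, the shuttingDown supplier stands in for
the Manager's shutdown check, and rethrowing the exception once shutdown has
begun is an assumption (the commit message does not say what happens then).

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of the workaround; not the real StatRecorder. */
final class RetryOnRejectSketch {

    /** Hypothetical stand-in for the down() call of the protocol below us. */
    interface Downstream {
        Object down(Object event);
    }

    /** Pause between attempts; 10ms is the value given in the commit message. */
    static final long RETRY_SLEEP_MS = 10;

    /**
     * Passes the event down and retries while the lower layer rejects it,
     * unless shutdown is in progress.
     */
    static Object downWithRetry(Downstream next, Object event, BooleanSupplier shuttingDown) {
        while (true) {
            try {
                return next.down(event);
            } catch (RejectedExecutionException e) {
                if (shuttingDown.getAsBoolean()) {
                    // Assumption: stop retrying and let the caller see the failure.
                    throw e;
                }
                try {
                    Thread.sleep(RETRY_SLEEP_MS);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and give up on the retry.
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

A caller would wrap the lower layer's call, for example
downWithRetry(lower::down, evt, () -> manager.shutdownInProgress()), where
lower and manager are again placeholders.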


> Unexpected DistributedSystemDisconnectedException caused by RejectedExecutionException
> --------------------------------------------------------------------------------------
>
>                 Key: GEODE-1178
>                 URL: https://issues.apache.org/jira/browse/GEODE-1178
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>    Affects Versions: 1.0.0-incubating.M1
>            Reporter: Bruce Schuchardt
>             Fix For: 1.0.0-incubating.M3
>
>
> A test in a private run failed when one of the members of the distributed system got this exception:
>   com.gemstone.gemfire.distributed.DistributedSystemDisconnectedException: Channel closed, caused by java.util.concurrent.RejectedExecutionException: Task org.jgroups.protocols.TP$3@3485cdbb rejected from java.util.concurrent.ThreadPoolExecutor@70067f19[Running, pool size = 4, active threads = 4, queued tasks = 500, completed tasks = 1717]
>   	at com.gemstone.gemfire.distributed.internal.membership.gms.messenger.JGroupsMessenger.send(JGroupsMessenger.java:673)
>   	at com.gemstone.gemfire.distributed.internal.membership.gms.messenger.JGroupsMessenger.send(JGroupsMessenger.java:589)
>   	at com.gemstone.gemfire.distributed.internal.membership.gms.mgr.GMSMembershipManager.send(GMSMembershipManager.java:1929)
>   	at com.gemstone.gemfire.distributed.internal.DistributionChannel.send(DistributionChannel.java:88)
>   	at com.gemstone.gemfire.distributed.internal.DistributionManager.sendOutgoing(DistributionManager.java:3471)
>   	at com.gemstone.gemfire.distributed.internal.DistributionManager.sendMessage(DistributionManager.java:3512)
>   	at com.gemstone.gemfire.distributed.internal.DistributionManager.putOutgoing(DistributionManager.java:1872)
>   	at com.gemstone.gemfire.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:527)
>   	at com.gemstone.gemfire.internal.cache.DistributedRegion.distributeDestroy(DistributedRegion.java:1847)
>   	at com.gemstone.gemfire.internal.cache.DistributedRegion.basicDestroyPart3(DistributedRegion.java:1838)
>   	at com.gemstone.gemfire.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:1571)
>   	at com.gemstone.gemfire.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:7189)
>   	at com.gemstone.gemfire.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:7161)
>   	at com.gemstone.gemfire.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:49)
>   	at com.gemstone.gemfire.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:7126)
>   	at com.gemstone.gemfire.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1806)
>   	at com.gemstone.gemfire.internal.cache.LocalRegion.validatedDestroy(LocalRegion.java:1210)
>   	at com.gemstone.gemfire.internal.cache.DistributedRegion.validatedDestroy(DistributedRegion.java:1072)
>   	at com.gemstone.gemfire.internal.cache.LocalRegion.destroy(LocalRegion.java:1195)
>   	at event.EventTest.destroyObject(EventTest.java:556)
>   	at event.EventTest.destroyObject(EventTest.java:540)
>   	at event.EventTest.doEntryOperations(EventTest.java:316)
>   	at event.EventTest.HydraTask_doEntryOperations(EventTest.java:200)
>   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   	at java.lang.reflect.Method.invoke(Method.java:497)
>   	at hydra.MethExecutor.execute(MethExecutor.java:199)
>   	at hydra.MethExecutor.execute(MethExecutor.java:163)
>   	at hydra.TestTask.execute(TestTask.java:195)
>   	at hydra.RemoteTestModule$1.run(RemoteTestModule.java:216)
>   Caused by: java.util.concurrent.RejectedExecutionException: Task org.jgroups.protocols.TP$3@3485cdbb rejected from java.util.concurrent.ThreadPoolExecutor@70067f19[Running, pool size = 4, active threads = 4, queued tasks = 500, completed tasks = 1717]
>   	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   	at org.jgroups.util.ShutdownRejectedExecutionHandler.rejectedExecution(ShutdownRejectedExecutionHandler.java:33)
>   	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>   	at org.jgroups.protocols.TP.loopback(TP.java:1522)
>   	at org.jgroups.protocols.TP.down(TP.java:1486)
>   	at org.jgroups.stack.Protocol.down(Protocol.java:438)
>   	at com.gemstone.gemfire.distributed.internal.membership.gms.messenger.StatRecorder.down(StatRecorder.java:84)
>   	at org.jgroups.protocols.pbcast.NAKACK2.down(NAKACK2.java:589)
>   	at org.jgroups.protocols.UNICAST3.down(UNICAST3.java:633)
>   	at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:347)
>   	at org.jgroups.protocols.UFC.handleDownMessage(UFC.java:135)
>   	at org.jgroups.protocols.FlowControl.down(FlowControl.java:328)
>   	at org.jgroups.protocols.FlowControl.sendCreditRequest(FlowControl.java:523)
>   	at org.jgroups.protocols.MFC.handleDownMessage(MFC.java:115)
>   	at org.jgroups.protocols.FlowControl.down(FlowControl.java:328)
>   	at org.jgroups.protocols.FRAG2.down(FRAG2.java:136)
>   	at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:1039)
>   	at org.jgroups.JChannel.down(JChannel.java:790)
>   	at org.jgroups.JChannel.send(JChannel.java:426)
>   	at com.gemstone.gemfire.distributed.internal.membership.gms.messenger.JGroupsMessenger.send(JGroupsMessenger.java:647)
>   	... 30 more
> Apparently JGroups has an internal executor that has limited bandwidth and can throw this exception when a multicast message needs to be looped back.
> I believe that the old JGroups stack had loopback disabled for multicast messages, and I don't think that Geode requires it, so I'm a little surprised that this is happening.
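
For reference, the rejection in the trace is what a java.util.concurrent
ThreadPoolExecutor with a bounded queue and AbortPolicy does once every thread
is busy and the queue is full. A minimal sketch, assuming a fixed pool and an
ArrayBlockingQueue (the queue type and sizing JGroups actually uses are not
shown in the trace; SaturatedPoolSketch is a hypothetical name):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of a saturated bounded pool; not JGroups code. */
final class SaturatedPoolSketch {

    /** Returns true if one more task is rejected once threads and queue are full. */
    static boolean rejectsWhenSaturated(int threads, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch release = new CountDownLatch(1);
        // Each task blocks until released, keeping every worker thread busy.
        Runnable blocked = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            // The first 'threads' tasks start workers; the rest fill the queue.
            for (int i = 0; i < threads + queueCapacity; i++) {
                pool.execute(blocked);
            }
            try {
                pool.execute(blocked); // no free thread, no queue slot left
                return false;
            } catch (RejectedExecutionException expected) {
                return true;
            }
        } finally {
            release.countDown(); // unblock the workers so the pool can drain
            pool.shutdown();
        }
    }
}
```

Calling rejectsWhenSaturated(4, 500) mirrors the numbers reported in the
exception message (4 active threads, 500 queued tasks).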



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
