geode-issues mailing list archives

From "Bruce Schuchardt (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (GEODE-950) split brain in wanAdminLocatorsPeerHAP2P
Date Tue, 09 Feb 2016 22:21:18 GMT

     [ https://issues.apache.org/jira/browse/GEODE-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce Schuchardt resolved GEODE-950.
------------------------------------
    Resolution: Fixed

This ticket is inappropriate for Apache JIRA as it uses an integration test that is not in the repo.

> split brain in wanAdminLocatorsPeerHAP2P
> ----------------------------------------
>
>                 Key: GEODE-950
>                 URL: https://issues.apache.org/jira/browse/GEODE-950
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce Schuchardt
>
> This test starts locators simultaneously, and both are configured to know about the other.
> In the run below, two locators created their own membership views, forming a split-brain at
> startup time instead of forming a single distributed system.
> Host name: w2-2013-lin-12
> OS name: Linux
> Architecture: amd64
> OS version: 3.10.0-229.el7.x86_64
> Java version: 1.8.0_66
> Java vm name: Java HotSpot(TM) 64-Bit Server VM
> Java vendor: Oracle Corporation
> Java home: /export/gcm/where/jdk/1.8.0_66/x86_64.linux/jre
>   #####################################################
>   
>   GemFire Version 9.0.0-SNAPSHOT
>   Source Date: 2016-02-03 16:09:18 -0800
>   Source Revision: 3f7070f117dbd8f2e5fb436b6aed3469e9fca673
>   Source Repository: develop
>   
>   Build Id: bruces 020416
>   Build Date: 2016-02-04 16:02:44 -0800
>   Build Version: 9.0.0-SNAPSHOT bruces 020416 2016-02-04 16:02:44 -0800 javac 1.8.0_66
>   Build JDK: Java 1.8.0_66
>   Build Platform: Linux 2.6.32-122.el6.x86_64 amd64
>   
>   #####################################################
> Test was run from /export/frodo2/users/bruce/devel/gfasf/closed/gemfire-test/build/resources/test/newWan/discovery/newWanDiscovery.bt
> Test:
> parReg/newWan/parallel/discovery/wanAdminLocatorsPeerHAP2P.conf
>    locatorHostsPerSite=4
>    locatorThreadsPerVM=1
>    locatorVMsPerHost=1
>    maxOps=300
>    peerHostsPerSite=2
>    peerMem=256m
>    peerThreadsPerVM=10
>    peerVMsPerHost=2
>    redundantCopies=1
>    resultWaitSec=600
>    wanSites=3
> Run with local.conf:
> hydra.HostPrms-hostNames = w2-2013-lin-12 w1-gst-dev03;
> //randomSeed extracted from test:
> hydra.Prms-randomSeed=1454836695339;
> *** Test failed with this error:
> CLIENT vm_17_thr_64_peer_2_1_w1-gst-dev03_3365
> INITTASK[2] newWan.WANTest.HydraTask_initPeerTask
> HANG a client exceeded max result wait sec: 600
> *** Last client logging by hung thread
> [info 2016/02/07 01:30:48.650 PST <vm_17_thr_64_peer_2_1_w1-gst-dev03_3365> tid=0x1e] Configured disk store factory: com.gemstone.gemfire.internal.cache.DiskStoreFactoryImpl@16cf1ca8
> *** Test declared hung 595996 ms after last client logging
> [severe 2016/02/07 01:40:44.646 PST <vm_17_thr_68_peer_2_1_w1-gst-dev03_2152 Dynamic Client VM Stopper> tid=0x274] Result for vm_17_thr_64_peer_2_1_w1-gst-dev03_3365: INITTASK[2] newWan.WANTest.HydraTask_initPeerTask: HANG a client exceeded max result wait sec: 600
> *** Hung thread
> "vm_17_thr_64_peer_2_1_w1-gst-dev03_3365" #30 daemon prio=5 os_prio=0 tid=0x00007f0ca0026000 nid=0xdd3 waiting on condition [0x00007f0cafffd000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x00000000f7429b60> (a java.util.concurrent.CountDownLatch$Sync)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> 	at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> 	at com.gemstone.gemfire.internal.cache.BucketPersistenceAdvisor.waitForPrimaryPersistentRecovery(BucketPersistenceAdvisor.java:363)
> 	at com.gemstone.gemfire.internal.cache.ProxyBucketRegion.waitForPrimaryPersistentRecovery(ProxyBucketRegion.java:633)
> 	at com.gemstone.gemfire.internal.cache.PRHARedundancyProvider.recoverPersistentBuckets(PRHARedundancyProvider.java:1821)
> 	at com.gemstone.gemfire.internal.cache.PartitionedRegion.initPRInternals(PartitionedRegion.java:1073)
> 	- locked <0x00000000f567aa10> (a com.gemstone.gemfire.internal.cache.PartitionedRegion)
> 	at com.gemstone.gemfire.internal.cache.PartitionedRegion.initialize(PartitionedRegion.java:1193)
> 	at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3171)
> 	at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:3063)
> 	at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:3052)
> 	at hydra.RegionHelper.createRegion(RegionHelper.java:129)
> 	- locked <0x00000000f68035b0> (a java.lang.Class for hydra.RegionHelper)
> 	at hydra.RegionHelper.createRegion(RegionHelper.java:93)
> 	- locked <0x00000000f68035b0> (a java.lang.Class for hydra.RegionHelper)
> 	at hydra.RegionHelper.createRegion(RegionHelper.java:80)
> 	- locked <0x00000000f68035b0> (a java.lang.Class for hydra.RegionHelper)
> 	at newWan.WANTest.initDatastoreRegion(WANTest.java:439)
> 	at newWan.WANTest.HydraTask_initPeerTask(WANTest.java:797)
> 	- locked <0x00000000f58842e8> (a java.lang.Class for newWan.WANTest)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:497)
> 	at hydra.MethExecutor.execute(MethExecutor.java:198)
> 	at hydra.MethExecutor.execute(MethExecutor.java:162)
> 	at hydra.TestTask.execute(TestTask.java:195)
> 	at hydra.RemoteTestModule$1.run(RemoteTestModule.java:216)
> Stack for hung thread vm_17_thr_64_peer_2_1_w1-gst-dev03_3365 was found 3 times and was unchanging.
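
A note on the thread state in the dump above: the hung thread is parked in CountDownLatch.await() with no timeout (state WAITING, not TIMED_WAITING), so it stays blocked until another thread counts the latch down. The standalone Java sketch below only illustrates that behavior by contrasting it with a bounded wait. The class LatchWaitSketch and the method awaitRecovery are hypothetical names made up for this illustration; they are not part of Geode or the hydra test framework, and this is not a proposed fix for the ticket.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; none of these names come from the Geode code base.
public class LatchWaitSketch {

    // Waits up to timeoutMillis for the latch to reach zero.
    // Returns true if the latch was released, false if the wait timed out.
    public static boolean awaitRecovery(CountDownLatch recovered, long timeoutMillis)
            throws InterruptedException {
        return recovered.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // Nothing ever counts this latch down; it stands in for a signal that never arrives.
        // An untimed await() here would park the caller indefinitely, as in the dump.
        CountDownLatch neverReleased = new CountDownLatch(1);
        System.out.println("released=" + awaitRecovery(neverReleased, 100));

        // Here another thread counts the latch down, so the bounded wait can succeed.
        CountDownLatch released = new CountDownLatch(1);
        Thread signaller = new Thread(released::countDown);
        signaller.start();
        System.out.println("released=" + awaitRecovery(released, 5_000));
        signaller.join();
    }
}
```

With a bounded wait, the caller gets a false return value it can log or act on, rather than relying on an external watchdog such as the 600-second result wait that declared this test hung.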



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
