geode-issues mailing list archives

From "Igor Barchak (JIRA)" <j...@apache.org>
Subject [jira] [Created] (GEODE-4051) Two server jvms crashed at same time and caused some primary and redundant buckets to be cleared. Causing some buckets to get locked and not able to recover also after bouncing all servers
Date Tue, 05 Dec 2017 16:37:00 GMT
Igor Barchak created GEODE-4051:
-----------------------------------

             Summary: Two server jvms crashed at same time and caused some primary and redundant
buckets to be cleared. Causing some buckets to get locked and not able to recover also after
bouncing all servers
                 Key: GEODE-4051
                 URL: https://issues.apache.org/jira/browse/GEODE-4051
             Project: Geode
          Issue Type: Bug
          Components: core
            Reporter: Igor Barchak
             Fix For: 1.2.0


"Pooled Waiting Message Processor 5" tid=0x162
    java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.CountDownLatch$Sync@1993a5
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
        at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:64)
        at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:715)
        at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:644)
        at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:624)
        at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:519)
        at org.apache.geode.internal.cache.StateFlushOperation.flush(StateFlushOperation.java:243)
        at org.apache.geode.internal.cache.InitialImageOperation.getFromOne(InitialImageOperation.java:349)
        at org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1168)
        at org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1023)
        at org.apache.geode.internal.cache.BucketRegion.initialize(BucketRegion.java:253)
        at org.apache.geode.internal.cache.LocalRegion.createSubregion(LocalRegion.java:962)
        at org.apache.geode.internal.cache.PartitionedRegionDataStore.createBucketRegion(PartitionedRegionDataStore.java:726)
        at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucket(PartitionedRegionDataStore.java:414)
        -  locked org.apache.geode.internal.cache.ProxyBucketRegion@6820a0b6
        at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucketRecursively(PartitionedRegionDataStore.java:272)
        at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabBucket(PartitionedRegionDataStore.java:2815)
        at org.apache.geode.internal.cache.partitioned.ManageBackupBucketMessage.operateOnPartitionedRegion(ManageBackupBucketMessage.java:148)
        at org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:332)





This seems to have been introduced by the following commit:

https://github.com/apache/geode/commit/3a1062e245b3ded52ea3f6b6de0aff94ce846fa3?diff=split

See StateMarkerMessage.process.

The first if branch has no finally block, while the else branch does.

Before this commit, the first if branch did not perform a 'waitFor' operation;
the commit introduced it.
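
To illustrate why the missing finally block could matter, here is a minimal sketch. It is not the Geode code: the class, the method names and the Runnable standing in for 'waitFor' are all hypothetical, and it assumes that the job of the finally block is to send the reply that the requesting member is waiting for. Under that assumption, a failure inside the newly added wait skips the reply in the unguarded branch, and the requester only sees its wait time out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Geode.
class ReplyInFinallySketch {

    // Shape described in the report: the first branch waits without a finally,
    // so an exception thrown by waitFor skips the reply.
    static void processUnguarded(boolean firstBranch, Runnable waitFor, CountDownLatch reply) {
        if (firstBranch) {
            waitFor.run();          // may throw
            reply.countDown();      // never reached if waitFor throws
        } else {
            try {
                // other work would go here
            } finally {
                reply.countDown();  // reply is always sent on this path
            }
        }
    }

    // Same logic, with the first branch guarded the same way as the else branch.
    static void processGuarded(boolean firstBranch, Runnable waitFor, CountDownLatch reply) {
        if (firstBranch) {
            try {
                waitFor.run();
            } finally {
                reply.countDown();  // reply is sent even if waitFor throws
            }
        } else {
            try {
                // other work would go here
            } finally {
                reply.countDown();
            }
        }
    }

    // Simulates a requester waiting for the reply while waitFor fails.
    static boolean replyArrives(boolean guarded) {
        CountDownLatch reply = new CountDownLatch(1);
        Runnable failingWait = () -> {
            throw new IllegalStateException("simulated failure during waitFor");
        };
        try {
            if (guarded) {
                processGuarded(true, failingWait, reply);
            } else {
                processUnguarded(true, failingWait, reply);
            }
        } catch (IllegalStateException expected) {
            // a message-processing thread would typically log this and carry on
        }
        try {
            return reply.await(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded reply arrives: " + replyArrives(false));
        System.out.println("guarded reply arrives: " + replyArrives(true));
    }
}
```

Running main prints "unguarded reply arrives: false" followed by "guarded reply arrives: true". Whether this is the actual cause of the stuck thread in the dump above would need to be confirmed against the real StateMarkerMessage.process code.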




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
