geode-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (GEODE-3055) data mismatch caused by rebalance. waitUntilFlashed return false
Date Tue, 01 Aug 2017 21:34:00 GMT


ASF GitHub Bot commented on GEODE-3055:

Github user gesterzhou commented on the issue:
    The forceRemovePrimary parameter was effectively useless and could have been removed, because every caller passed "false".
    But when I added the logic to remove the leader region bucket (when the shadow bucket fails to initialize), I had to call removeBucket(xxx, forceRemovePrimary=true) myself.
    When removing the leader bucket in the error handling, I have to skip a few "return false" exit points, because at that point the leader bucket is not logically ready and does not qualify for removal unless I force it.
    So I made use of the forceRemovePrimary parameter. Maybe I should rename it to something better, such as forceToRemove, though.
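
    A minimal sketch of the idea in the comment above, not Geode's actual code: a force flag that lets an error-handling path bypass the early "return false" exits. The class name, fields, and guard conditions are all hypothetical and chosen only for illustration.

```java
// Hypothetical sketch; BucketRemovalSketch and its fields are illustrative
// names and do not mirror Geode's real removeBucket implementation.
class BucketRemovalSketch {
    boolean initialized = false; // the leader bucket is not logically ready yet
    boolean primary = false;     // whether this member hosts the primary copy
    boolean removed = false;

    // Returns true if the bucket was removed. Without the force flag, each
    // guard below is an early "return false" exit point.
    boolean removeBucket(boolean forceToRemove) {
        if (!forceToRemove) {
            if (!initialized) {
                return false; // not ready, so normally not eligible for removal
            }
            if (primary) {
                return false; // a primary is normally not removed
            }
        }
        removed = true;
        return true;
    }
}
```

    With these assumptions, a regular caller passing false cannot remove a half-initialized bucket, while the error-handling path passing true can.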

> data mismatch caused by rebalance. waitUntilFlashed return false
> ----------------------------------------------------------------
>                 Key: GEODE-3055
>                 URL:
>             Project: Geode
>          Issue Type: Bug
>            Reporter: xiaojian zhou
>            Assignee: xiaojian zhou
>              Labels: lucene
> /export/buglogs_bvt/xzhou/lucene/concParRegHAPersist-0601-171739
> lucene/concParRegHAPersist.conf
> A=accessor
> B=dataStore
> accessorHosts=1
> accessorThreadsPerVM=5
> accessorVMsPerHost=1
> dataStoreHosts=6
> dataStoreThreadsPerVM=5
> dataStoreVMsPerHost=1
> numVMsToStop=2
> redundantCopies=0
> no local.conf
> In dataStoregemfire5_7483/system.log, thread tid=0xdf, putAll Object_11066
> 17:22:27.135 tid=0xdf] generated tag {v1; rv13 shadowKey=2939
> 17:22:27.136 _partitionedRegionPARALLELGATEWAYSENDER_QUEUE_1 bucket : null // brq is not ready yet
> is enqueued to the tempQueue
> 17:22:27.272 tid=0xdf] generated tag {v3; rv15 shadowKey=3278
> 17:22:33.111 Subregion created: /_PR/_BAsyncEventQueueindex#partitionedRegionPARALLELGATEWAYSENDER_QUEUE_1
> vm_3_dataStore3_r02-s28_28143.log:
> 17:22:33.120 Put successfully in the queue shadowKey= 2939
> 17:22:33.156 tid=0x7fe started query
> 17:22:33.176 Peeked shadowKey= 2939
> So the root cause is: the query ran while the event was still in tempQueue, before it had been processed. WaitUntilFlush should wait until tempQueue is also flushed.
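
The root cause described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Geode's queue implementation: the class, its two queues, and both "flushed" checks are hypothetical names invented for the example. It shows why a flush check that looks only at the bucket queue reports success while an event is still parked in the temporary queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical model; FlushWaitSketch and its members are illustrative and
// do not correspond to real Geode classes or methods.
class FlushWaitSketch {
    final Queue<String> tempQueue = new ArrayDeque<>();   // holds events while the bucket queue is not ready
    final Queue<String> bucketQueue = new ArrayDeque<>(); // the real queue, once it has been created
    boolean bucketReady = false;

    void enqueue(String event) {
        if (bucketReady) {
            bucketQueue.add(event);
        } else {
            tempQueue.add(event); // bucket queue is not ready yet
        }
    }

    // When the bucket queue is created, parked events are moved into it.
    void bucketBecameReady() {
        bucketReady = true;
        bucketQueue.addAll(tempQueue);
        tempQueue.clear();
    }

    // Stand-in for the dispatcher draining the queue.
    void dispatchAll() {
        bucketQueue.clear();
    }

    // Flawed check: ignores events that are still parked in tempQueue.
    boolean flushedIgnoringTempQueue() {
        return bucketQueue.isEmpty();
    }

    // Check matching the proposed fix: both queues must be empty.
    boolean flushedIncludingTempQueue() {
        return tempQueue.isEmpty() && bucketQueue.isEmpty();
    }
}
```

In this model, an event enqueued before the bucket queue exists makes the first check return true too early, while the second check keeps returning false until the event has been moved and dispatched.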

This message was sent by Atlassian JIRA