ignite-issues mailing list archives

From "Semen Boikov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-1027) Possible data loss in replicated cache on unstable topology.
Date Mon, 24 Aug 2015 08:17:45 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708944#comment-14708944 ]

Semen Boikov commented on IGNITE-1027:

From the rebalance code I see that SYNC rebalance is broken if multiple nodes start concurrently:
- method 'GridDhtPartitionDemandPool.assign' returns empty assignments if there are pending
- DemandWorkers receive the empty assignments, finish the loop and complete SyncFuture
- Ignite exits from the start method before rebalancing has really finished

DemandWorkers can also stop the rebalance process and complete SyncFuture if the topology
changes during rebalancing (see usages of DemandWorker.topologyChanged()).
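The premature completion described above can be sketched in a few lines. This is a hypothetical illustration of the race, not the actual Ignite classes: `assign`, `runDemandWorker`, and the partition list stand in for `GridDhtPartitionDemandPool.assign`, `DemandWorker`, and the real assignment structures. The point is that an empty assignment set lets the worker complete the sync future without having demanded a single partition, so a caller waiting on it (e.g. node start) proceeds too early.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch of the race, with made-up names (not the Ignite API).
public class SyncRebalanceRaceSketch {
    // Stand-in for GridDhtPartitionDemandPool.assign(): while a concurrent
    // exchange is in progress, no assignments are produced.
    static List<Integer> assign(boolean exchangeInProgress) {
        return exchangeInProgress ? Collections.emptyList() : List.of(1, 2, 3);
    }

    // Stand-in for a DemandWorker loop: with no assignments it exits at once
    // and completes the sync future even though nothing was rebalanced.
    static int runDemandWorker(List<Integer> assignments,
                               CompletableFuture<Void> syncFut) {
        int demanded = 0;
        for (Integer part : assignments)
            demanded++; // the real worker would request partition data here
        syncFut.complete(null); // completed even when demanded == 0
        return demanded;
    }

    public static void main(String[] args) {
        CompletableFuture<Void> syncFut = new CompletableFuture<>();
        int demanded = runDemandWorker(assign(true), syncFut);
        // A caller blocked on syncFut (the node's start path) now proceeds
        // although no partitions were actually demanded:
        System.out.println("syncFuture done=" + syncFut.isDone()
            + ", partitions demanded=" + demanded);
    }
}
```

In this sketch the fix would be to not treat an empty assignment set as "rebalance finished" while an exchange is still pending, which matches the problem stated above.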

> Possible data loss in replicated cache on unstable topology.
> ------------------------------------------------------------
>                 Key: IGNITE-1027
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1027
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergi Vladykin
>            Assignee: Semen Boikov
>             Fix For: ignite-1.4
> In test IgniteCacheClientQueryReplicatedNodeRestartSelfTest we have 4 data nodes with
> replicated caches and a single client-only node, which runs SQL queries against those
> data nodes. Background threads restart data nodes. When we restart 2 of 4 data nodes
> everything is fine; when 3 of 4, eventually the query returns an inconsistent result and
> cache size returns smaller values than expected. Since we use rebalance mode SYNC, such
> data loss should not happen.

This message was sent by Atlassian JIRA
