ignite-issues mailing list archives

From "Alexander Gerus (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (IGNITE-8676) Possible data loss after stoping/starting several nodes at the same time
Date Tue, 26 Jun 2018 08:48:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Gerus updated IGNITE-8676:
------------------------------------
    Comment: was deleted

(was: Assigned to Stan, as the solution for the issue is known and should be merged to the
affected 2.4 master)

> Possible data loss after stoping/starting several nodes at the same time
> ------------------------------------------------------------------------
>
>                 Key: IGNITE-8676
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8676
>             Project: Ignite
>          Issue Type: Bug
>          Components: persistence
>    Affects Versions: 2.4
>            Reporter: Andrey Aleksandrov
>            Assignee: Stanislav Lukyanov
>            Priority: Critical
>             Fix For: 2.6
>
>         Attachments: DataLossTest.zip, Ignite8676Test.java, image-2018-06-01-12-34-54-320.png, image-2018-06-01-13-12-47-218.png, image-2018-06-01-13-15-17-437.png
>
>
> Steps to reproduce:
> 1) Start 3 data nodes (DN1, DN2, DN3) with a configuration that contains a cache with a node filter for only these three nodes and 1 backup (see the configuration in the attachment).
> 2) Activate the cluster. You should now have 3 nodes in the baseline topology (BLT).
> 3) Start a new server node (SN). You should now have 3 nodes in the BLT and 1 node outside the baseline.
> 4) From any node, load about 10000 (or more) entities into the cache.
> 5) Check that the number of primary partitions equals the number of backup partitions.
> !image-2018-06-01-12-34-54-320.png!
> 6) Now stop DN3 and SN, then start them at the same time.
> 7) When DN3 and SN are online, check that the number of primary partitions (PN) equals the number of backup partitions (BN).
> 7.1) If PN == BN, go to step 6).
> 7.2) If PN != BN, go to step 8).
>  
> !image-2018-06-01-13-12-47-218.png!
> 8) Deactivate the cluster with control.sh.
> 9) Activate the cluster with control.sh.
> Now you should see the data loss.
> !image-2018-06-01-13-15-17-437.png!
> Notes:
> 1) The stops and starts should be done at the same time.
> 2) Consistent IDs for the nodes should be constant.
> Also, I provide a reproducer that often (but not always) reproduces this issue. Clean the working directory and restart the reproducer if there is no data loss in a given iteration.
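
The PN/BN check in steps 7), 7.1) and 7.2) can be sketched as a small decision helper. This is only an illustration, not code from the attached reproducer: the class PartitionBalanceCheck and its methods are hypothetical, the per-node counts in main are made-up numbers, and it assumes PN and BN are cluster-wide totals summed over the data nodes. In a live cluster the per-node counts could presumably be taken from Ignite's Affinity API (primaryPartitions / backupPartitions).

```java
import java.util.Arrays;

// Hypothetical helper: not part of Ignite and not taken from the attached reproducer.
final class PartitionBalanceCheck {
    /** Sums per-node partition counts into a cluster-wide total. */
    static int total(int[] countsPerNode) {
        return Arrays.stream(countsPerNode).sum();
    }

    /**
     * Decides which step of the reproduction procedure comes next.
     * Returns 6 (restart DN3 and SN again) when PN == BN,
     * and 8 (deactivate the cluster) when PN != BN.
     */
    static int nextStep(int[] primaryPerNode, int[] backupPerNode) {
        int pn = total(primaryPerNode);
        int bn = total(backupPerNode);
        return pn == bn ? 6 : 8;
    }

    public static void main(String[] args) {
        // Made-up counts for DN1, DN2, DN3; not measured on a real cluster.
        int[] primary = {342, 341, 341};
        int[] balancedBackup = {341, 342, 341};
        int[] unbalancedBackup = {341, 342, 300};

        System.out.println("balanced   -> step " + nextStep(primary, balancedBackup));
        System.out.println("unbalanced -> step " + nextStep(primary, unbalancedBackup));
    }
}
```

With one backup configured, the two totals are expected to match, so a mismatch is the signal to go on to the deactivate/activate steps.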



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
