ignite-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-9004) Failed to reinitialize local partitions
Date Thu, 19 Jul 2018 11:30:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16549154#comment-16549154 ]

ASF GitHub Bot commented on IGNITE-9004:
----------------------------------------

GitHub user EdShangGG opened a pull request:

    https://github.com/apache/ignite/pull/4383

    IGNITE-9004

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gridgain/apache-ignite ignite-9004

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/ignite/pull/4383.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4383
    
----
commit c4ab6c9481d0b019434e65085c08f501a58259c8
Author: EdShangGG <eshangareev@...>
Date:   2018-07-17T19:13:32Z

    IGNITE-9004 WIP

commit 3229732dee496a22606504893681d2cec3f7dc93
Author: EdShangGG <eshangareev@...>
Date:   2018-07-18T11:50:54Z

    IGNITE-9004 WIP

commit 476de884606860599156b15b338c59353dc2a980
Author: EdShangGG <eshangareev@...>
Date:   2018-07-18T16:23:47Z

    IGNITE-9004 WIP

commit 3c1f0503c934dcafa475a21cc9f2859324e5978d
Author: EdShangGG <eshangareev@...>
Date:   2018-07-18T19:00:49Z

    IGNITE-9004 WIP

commit 3cd22b88c8569b9d9bd6efeafc217019df5d99ad
Author: Eduard Shangareev <eshangareev@...>
Date:   2018-07-19T11:27:57Z

    Revert "IGNITE-9004 WIP"
    
    This reverts commit 3c1f050

----


> Failed to reinitialize local partitions
> ---------------------------------------
>
>                 Key: IGNITE-9004
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9004
>             Project: Ignite
>          Issue Type: Test
>            Reporter: Anton Kalashnikov
>            Assignee: Eduard Shangareev
>            Priority: Critical
>              Labels: MakeTeamcityGreenAgain
>             Fix For: 2.7
>
>
> Reproduced by the Activate/Deactivate suite, in almost any test of the IgniteChangeGlobalStateTest class, for example IgniteChangeGlobalStateTest#testStopPrimaryAndActivateFromClientNode.
> {noformat}
> Failed to reinitialize local partitions (preloading will be stopped): GridDhtPartitionExchangeId
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=1], discoEvt=DiscoveryCustomEvent [customMsg=ChangeGlobalStateMessage
[id=9093c48a461-165cdacd-8a3b-4072-9f48-e80e1b63fda9, reqId=07393ea5-1c6a-4581-b016-9eb88d6bd978,
initiatingNodeId=8dced5ba-725d-494b-8e8e-ffc76453fecd, activate=true, baselineTopology=BaselineTopology
[id=0, branchingHash=314980173, branchingType='Cluster activation', baselineNodes=[node2,
node0, node1]], forceChangeBaselineTopology=false, timestamp=1531832492029], affTopVer=AffinityTopologyVersion
[topVer=6, minorTopVer=1], super=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=8dced5ba-725d-494b-8e8e-ffc76453fecd,
addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.25.4.132], sockAddrs=[/172.25.4.132:47504, /0:0:0:0:0:0:0:1%lo:47504,
/127.0.0.1:47504], discPort=47504, order=2, intOrder=2, lastExchangeTime=1531832486546, loc=false,
ver=2.7.0#19700101-sha1:00000000, isClient=false], topVer=6, nodeId8=9960f6b9, msg=null, type=DISCOVERY_CUSTOM_EVT,
tstamp=1531832492035]], nodeId=8dced5ba, evt=DISCOVERY_CUSTOM_EVT]
> java.lang.AssertionError: calculatedOffset=3072, allocated=2048, headerSize=1024
> at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:358)
> at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:400)
> at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:384)
> at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:783)
> at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:627)
> at org.apache.ignite.internal.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:144)
> at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.init(PagesList.java:169)
> at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.<init>(AbstractFreeList.java:371)
> at org.apache.ignite.internal.processors.cache.persistence.metastorage.MetaStorage$FreeListImpl.<init>(MetaStorage.java:484)
> at org.apache.ignite.internal.processors.cache.persistence.metastorage.MetaStorage.init(MetaStorage.java:143)
> at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointAndRestoreMemory(GridCacheDatabaseSharedManager.java:852)
> at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onClusterStateChangeRequest(GridDhtPartitionsExchangeFuture.java:954)
> at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:661)
> at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2484)
> at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2364)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Given:
> # Node1-1 is activated in grid1.
> # MetaStorage on node1-1 resides in off-heap memory.
> # MetaStorage has not been written to disk yet.
> When:
> # A checkpoint starts on node1-1, and the checkpoint start marker is written.
> # node2-1 in grid2 starts (grid1 and grid2 share the same persistence storage).
> Then:
> # node2-1 finds an unexpected checkpoint marker ("Found unexpected checkpoint marker") and initializes the FilePageStore for MetaStorage with an empty page.
> # node1-1 finishes the checkpoint and writes MetaStorage to disk.
> # After grid1 is stopped and grid2 is activated, node2-1 fails because it tries to read more than one page.
> Possible solutions:
> * Skip initializing the FilePageStore for MetaStorage with an empty page during start.
> * Take a lock on MetaStorage so that only one node can read or write a given MetaStorage at a time.
> * Reinitialize the FilePageStore from disk when the cluster is activated.
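
The failure mode described in the issue can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Ignite code: the class StalePageStoreSketch and all of its methods are hypothetical, the page size of 1024 bytes is assumed (only headerSize=1024 appears in the assertion message), and the offset formula pageIdx * pageSize + headerSize and the bounds check are guesses chosen to match the reported numbers (calculatedOffset=3072, allocated=2048). It shows a store whose cached "allocated" size goes stale after another node writes to the same file, and how re-reading the size from disk (the third proposal) would let the read pass.

```java
/**
 * Hypothetical model of a page store that caches its allocated size at init time.
 * Not Ignite code; names and formulas are illustrative assumptions.
 */
public class StalePageStoreSketch {
    /** Header size taken from the assertion message in the report. */
    static final long HEADER_SIZE = 1024;

    /** Assumed page size; the report does not state it. */
    static final long PAGE_SIZE = 1024;

    /** This node's cached view of how many bytes the store file holds. */
    private long allocated;

    /** Models "initialize FilePageStore with an empty page": header plus one page. */
    void initWithEmptyPage() {
        allocated = HEADER_SIZE + PAGE_SIZE;
    }

    /** Models the third proposal: refresh the cached size from the file on disk. */
    void reinitializeFromDisk(long fileLength) {
        allocated = fileLength;
    }

    /** Assumed offset formula: pages are laid out after the header. */
    static long pageOffset(long pageIdx) {
        return pageIdx * PAGE_SIZE + HEADER_SIZE;
    }

    /** Assumed bounds check: the page offset must not exceed the cached size. */
    boolean withinBounds(long pageIdx) {
        return pageOffset(pageIdx) <= allocated;
    }

    public static void main(String[] args) {
        // node2-1 starts and initializes its store with a single empty page.
        StalePageStoreSketch node2 = new StalePageStoreSketch();
        node2.initWithEmptyPage();

        // Meanwhile node1-1 finishes its checkpoint; assume the file now holds three pages.
        long fileLengthAfterCheckpoint = HEADER_SIZE + 3 * PAGE_SIZE;

        // Stale view: offset 3072 exceeds allocated 2048, as in the reported assertion.
        System.out.println("stale view passes check: " + node2.withinBounds(2));

        // After refreshing from disk the same read is within bounds.
        node2.reinitializeFromDisk(fileLengthAfterCheckpoint);
        System.out.println("refreshed view passes check: " + node2.withinBounds(2));
    }
}
```

In this model the first check fails and the second passes, which matches the sequence in the Given/When/Then steps. Whether the actual fix in the pull request takes this approach is not stated in the message.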



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
