ignite-issues mailing list archives

From "Semen Boikov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-10218) Detecting lost partitions PME phase triggered twice on coordinator
Date Mon, 12 Nov 2018 19:30:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-10218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684285#comment-16684285 ]

Semen Boikov commented on IGNITE-10218:
---------------------------------------

[~antonovsergey93], yes, it looks like the second call, the one in GridDhtPartitionsExchangeFuture#onDone(),
can be removed on the coordinator.
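
The suggested change can be illustrated with a minimal sketch. This is not Ignite's actual code: the class ExchangeFutureSketch, the crd flag, and the call counter are hypothetical names introduced for illustration, and the sketch assumes that non-coordinator nodes still need the call in onDone().

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an exchange future; NOT the real GridDhtPartitionsExchangeFuture.
class ExchangeFutureSketch {
    /** True when the local node is the coordinator (assumed flag). */
    private final boolean crd;

    /** Counts how many times lost-partition detection ran for this exchange. */
    private final AtomicInteger detectCalls = new AtomicInteger();

    ExchangeFutureSketch(boolean crd) {
        this.crd = crd;
    }

    /** Coordinator-only path: detection runs here, before the full message is prepared. */
    void finishExchangeOnCoordinator() {
        detectLostPartitions();
    }

    /** Completion path: skip detection on the coordinator because it already ran. */
    void onDone() {
        if (!crd)
            detectLostPartitions();
    }

    /** Placeholder for the real detection work. */
    private void detectLostPartitions() {
        detectCalls.incrementAndGet();
    }

    int detectCalls() {
        return detectCalls.get();
    }
}
```

With this guard, a coordinator that goes through finishExchangeOnCoordinator() and then onDone() runs detection once, and a non-coordinator node still runs it once from onDone().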

> Detecting lost partitions PME phase triggered twice on coordinator
> ------------------------------------------------------------------
>
>                 Key: IGNITE-10218
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10218
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Antonov
>            Priority: Major
>             Fix For: 2.8
>
>
> Scenario: 1 server left.
>  The coordinator node seems to detect partition losses twice per exchange:
> {noformat}
> [16:54:22,027][INFO][exchange-worker-#66][time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], crd=true]
> [16:54:22,163][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=525ms, reason='timeout']
> [16:54:22,338][INFO][sys-#136][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=2b69b32f-1bea-4c83-a70d-d7ff8ad7e319, allReceived=false]
> [16:54:22,401][INFO][sys-#137][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=933628df-5237-435c-81d3-7d4be20d8cea, allReceived=false]
> [16:54:22,405][INFO][sys-#73][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=549935a6-48b0-47cd-a763-13cef4706960, allReceived=false]
> [16:54:22,413][INFO][sys-#121][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=fbdde4e1-2422-49af-a0bb-d6797cc723fe, allReceived=false]
> [16:54:22,722][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=d67a748d-9c63-4ede-8dae-064b63dd1586, allReceived=true]
> [16:54:23,493][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=849ms, reason='timeout']
> [16:54:23,494][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Coordinator received all messages, try merge [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0]]
> [16:54:23,494][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Exchanges merging performed in 0 ms.
> [16:54:23,494][INFO][sys-#122][GridDhtPartitionsExchangeFuture] finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=13, minorTopVer=0]]
> [16:54:24,223][INFO][sys-#122][CacheAffinitySharedManager] Affinity recalculation (on server left) performed in 729 ms.
> [16:54:24,371][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=726ms, reason='timeout']
> [16:54:25,146][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=1ms, checkpointLockHoldTime=493ms, reason='timeout']
> [16:54:26,443][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=8ms, checkpointLockHoldTime=776ms, reason='timeout']
> [16:54:26,443][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Affinity changes (coordinator) applied in 2949 ms.
> [16:54:26,758][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Partitions validation performed in 307 ms.
> [16:54:27,398][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=725ms, reason='timeout']
> [16:54:27,646][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Detecting lost partitions performed in 887 ms.
> [16:54:28,908][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Preparing Full Message performed in 1138 ms.
> [16:54:28,908][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Sending Full Message performed in 0 ms.
> [16:54:28,908][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Sending Full Message to all nodes performed in 0 ms.
> [16:54:28,908][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], err=null]
> [16:54:29,171][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=486ms, reason='timeout']
> [16:54:29,316][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Detecting lost partitions performed in 407 ms.
> {noformat}
> Lost partition detection is invoked from two places:
>  # GridDhtPartitionsExchangeFuture#finishExchangeOnCoordinator()
>  # GridDhtPartitionsExchangeFuture#onDone()
> Do we really need to detect lost partitions twice per exchange?
>  
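
To check a coordinator log for the duplicate phase described in the issue, the "Detecting lost partitions performed in N ms" entries can be counted and their durations summed. The following is a minimal sketch, assuming the message format shown in the log excerpt; the class LostPartitionLogCheck and its summarize method are hypothetical helpers, not part of Ignite.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log helper for illustration; not part of Apache Ignite.
class LostPartitionLogCheck {
    /** Matches the phase message even if the log line was wrapped after "partitions". */
    private static final Pattern DETECT =
        Pattern.compile("Detecting lost partitions\\s+performed in (\\d+) ms");

    /**
     * Scans log text for lost-partition detection entries.
     *
     * @return a two-element array: {number of entries, total duration in ms}
     */
    static long[] summarize(String log) {
        Matcher m = DETECT.matcher(log);
        long cnt = 0;
        long totalMs = 0;

        while (m.find()) {
            cnt++;
            totalMs += Long.parseLong(m.group(1));
        }

        return new long[] {cnt, totalMs};
    }
}
```

For the excerpt in this issue, the two entries (887 ms and 407 ms) add up to 1294 ms spent in detection during a single exchange on the coordinator.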



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
