ignite-issues mailing list archives

From "Ivan Pavlukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-11148) PartitionCountersNeighborcastFuture blocks partition map exchange
Date Fri, 01 Feb 2019 10:12:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758168#comment-16758168
] 

Ivan Pavlukhin commented on IGNITE-11148:
-----------------------------------------

A flaw was found in the initialization of the compound {{PartitionCountersNeighborcastFuture}}.
A response from a remote node could be received before the mini future was added to the parent.
As a result, the mini future, and therefore the parent future, was never completed.
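
The ordering problem described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Ignite's actual code: the class {{NeighborcastRaceSketch}}, the nested {{CompoundFuture}}, and the methods {{addMini}}, {{onResponse}}, {{markInitialized}} and {{parentCompletes}} are hypothetical names, and {{CompletableFuture}} stands in for Ignite's internal future types. The remote response is simulated as arriving synchronously during the send.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class NeighborcastRaceSketch {
    /** Hypothetical stand-in for a compound future; not Ignite's actual class. */
    static class CompoundFuture {
        /** Mini futures registered so far, keyed by remote node id. */
        private final Map<String, CompletableFuture<Void>> minis = new ConcurrentHashMap<>();

        /** Registers a mini future that waits for a response from the given node. */
        void addMini(String nodeId) {
            minis.put(nodeId, new CompletableFuture<>());
        }

        /** Completes the matching mini future; a response with no registered mini is lost. */
        void onResponse(String nodeId) {
            CompletableFuture<Void> mini = minis.get(nodeId);
            if (mini != null)
                mini.complete(null);
        }

        /** Returns the parent future, which completes once all registered minis are done. */
        CompletableFuture<Void> markInitialized() {
            return CompletableFuture.allOf(minis.values().toArray(new CompletableFuture[0]));
        }
    }

    /** Runs one scenario and reports whether the parent future ends up completed. */
    static boolean parentCompletes(boolean registerBeforeSend) {
        CompoundFuture fut = new CompoundFuture();
        // Assumption: the response arrives immediately, before the sender does anything else.
        Runnable send = () -> fut.onResponse("node-1");

        if (registerBeforeSend) {
            fut.addMini("node-1"); // mini is visible before any response can arrive
            send.run();
        } else {
            send.run();            // response arrives first and finds no mini future
            fut.addMini("node-1"); // this mini can no longer be completed
        }
        return fut.markInitialized().isDone();
    }

    public static void main(String[] args) {
        System.out.println("send, then register: done=" + parentCompletes(false));
        System.out.println("register, then send: done=" + parentCompletes(true));
    }
}
```

In this sketch the parent stays incomplete only in the send-then-register order, which matches the hang described above. Registering the mini future before the request goes out is one way to close the window; whether the actual fix takes that form is not stated here.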

>  PartitionCountersNeighborcastFuture blocks partition map exchange
> ------------------------------------------------------------------
>
>                 Key: IGNITE-11148
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11148
>             Project: Ignite
>          Issue Type: Bug
>          Components: mvcc
>            Reporter: Stepachev Maksim
>            Assignee: Ivan Pavlukhin
>            Priority: Major
>              Labels: Faillover, Hanging, Transactions, mvcc_stabilization_stage_1
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We investigated an "execution timeout" in Continuous Query 2 for *CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover*.
The investigation showed an MVCC problem: the test blocks at *getAndPut*
because, at some point, the following incorrect behavior occurs:
> {code:java}
> [16:02:56] :     [Step 4/5] [2019-01-30 13:02:56,923][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][IgniteTxManager]
Finishing prepared transaction [commit=false, tx=GridDhtTxRemote [nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b900004,
rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4, nearXidVer=GridCacheVersion [topVer=160333378,
order=1548853376060, nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter
[explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl [readMap=EmptyMap
{}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [xidVer=GridCacheVersion
[topVer=160333378, order=1548853376061, nodeOrder=3], writeVer=GridCacheVersion [topVer=160333378,
order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731,
nodeId=3e6881c0-1e96-42a9-8bd1-55d344c00002, startVer=GridCacheVersion [topVer=160333378,
order=1548853376060, nodeOrder=1], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC,
timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=160333378,
order=1548853376061, nodeOrder=3], finalizing=NONE, invalidParts=null, state=PREPARED, timedOut=false,
topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs
[crdVer=1548853371043, cntr=207, cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null,
duration=191ms, onePhaseCommit=false]]]]{code}
> and after that:
> {code:java}
> [16:02:56] :     [Step 4/5] [2019-01-30 13:02:56,931][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][recovery]
Starting delivery partition countres to remote nodes [txId=GridCacheVersion [topVer=160333378,
order=1548853376060, nodeOrder=5], futId=82cfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4{code}
> _IMPORTANT: this involves PartitionCountersNeighborcastFuture, which *doesn't provide
status information* (monitoring)._
> One possible location of the problem: PartitionCountersNeighborcastFuture.onNodeLeft
> As a result, the transaction remains in *state=PREPARED* with *completionTime=0* and never
completes:
>  
> {code:java}
> [16:03:16]W: [org.apache.ignite:ignite-indexing] [2019-01-30 13:03:16,776][WARN ][exchange-worker-#40%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][diagnostic]
Failed to wait for partition release future [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0],
node=18519119-475a-448f-8c02-ff1f64900000]
> LocalTxReleaseFuture [
>  topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], 
>  futures=[
>  TxFinishFuture [ 
>  tx=GridDhtTxRemote [
>  nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b900004, rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4,

>  nearXidVer=GridCacheVersion [topVer=160333378, order=1548853376060, nodeOrder=5], storeWriteThrough=false,
super=GridDistributedTxRemoteAdapter [explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl
[readMap=EmptyMap {}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter
[
>  xidVer=GridCacheVersion [topVer=160333378, order=1548853376061, nodeOrder=3], 
>  writeVer=GridCacheVersion [topVer=160333378, order=1548853376062, nodeOrder=3], implicit=false,
loc=false, threadId=21, startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c00002,
startVer=GridCacheVersion [topVer=160333378, order=1548853376060, nodeOrder=1], endVer=null,
isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false,
plc=2, commitVer=GridCacheVersion [topVer=160333378, order=1548853376061, nodeOrder=3], finalizing=RECOVERY_FINISH,
invalidParts=null, state=PREPARED, timedOut=false, topVer=AffinityTopologyVersion [topVer=7,
minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207, cleanupVer=204,
opCntr=0], skipCompletedVers=false, parentTx=null, duration=20048ms, onePhaseCommit=false]]],
completionTime=0, duration=20048]
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
