Mailing-List: contact issues-help@ignite.apache.org; run by ezmlm
Reply-To: dev@ignite.apache.org
Date: Wed, 30 Jan 2019 15:48:00 +0000 (UTC)
From: "Andrew Mashenkov (JIRA)"
To: issues@ignite.apache.org
Subject: [jira] [Updated] (IGNITE-11148) PartitionCountersNeighborcastFuture blocks partition map exchange

     [ https://issues.apache.org/jira/browse/IGNITE-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Mashenkov updated IGNITE-11148:
--------------------------------------
    Fix Version/s: 2.8

> PartitionCountersNeighborcastFuture blocks partition map exchange
> ------------------------------------------------------------------
>
>                 Key: IGNITE-11148
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11148
>             Project: Ignite
>          Issue Type: Bug
>          Components: mvcc
>            Reporter: Stepachev Maksim
>            Priority: Major
>              Labels: mvcc_stabilization_stage_1
>             Fix For: 2.8
>
>
> We investigated an "execution timeout" problem in Continuous Query 2 for *CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover*.
> The investigation showed an MVCC problem. As a result, the test blocks at *getAndPut*, because at some point the following wrong behavior occurred:
> {code:java}
> [16:02:56] :     [Step 4/5] [2019-01-30 13:02:56,923][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][IgniteTxManager] Finishing prepared transaction [commit=false, tx=GridDhtTxRemote [nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b900004, rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4, nearXidVer=GridCacheVersion [topVer=160333378, order=1548853376060, nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter [explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=160333378, order=1548853376061, nodeOrder=3], writeVer=GridCacheVersion [topVer=160333378, order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c00002, startVer=GridCacheVersion [topVer=160333378, order=1548853376060, nodeOrder=1], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=160333378, order=1548853376061, nodeOrder=3], finalizing=NONE, invalidParts=null, state=PREPARED, timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207, cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null, duration=191ms, onePhaseCommit=false]]]]{code}
> and after that:
> {code:java}
> [16:02:56] :     [Step 4/5] [2019-01-30 13:02:56,931][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][recovery] Starting delivery partition countres to remote nodes [txId=GridCacheVersion [topVer=160333378, order=1548853376060, nodeOrder=5], futId=82cfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4{code}
> _!IMPORTANT - we work with PartitionCountersNeighborcastFuture, which *doesn't provide status information* (monitoring)._
> One possible location of the problem: PartitionCountersNeighborcastFuture.onNodeLeft
> As a result, we have a transaction in *state=PREPARED* with *completionTime=0* that never completes:
>
> {code:java}
> [16:03:16]W: [org.apache.ignite:ignite-indexing] [2019-01-30 13:03:16,776][WARN ][exchange-worker-#40%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][diagnostic] Failed to wait for partition release future [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], node=18519119-475a-448f-8c02-ff1f64900000]
> LocalTxReleaseFuture [
> topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0],
> futures=[
> TxFinishFuture [
> tx=GridDhtTxRemote [
> nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b900004, rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4,
> nearXidVer=GridCacheVersion [topVer=160333378, order=1548853376060, nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter [explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [
> xidVer=GridCacheVersion [topVer=160333378, order=1548853376061, nodeOrder=3],
> writeVer=GridCacheVersion [topVer=160333378, order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c00002, startVer=GridCacheVersion [topVer=160333378, order=1548853376060, nodeOrder=1], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=160333378, order=1548853376061, nodeOrder=3], finalizing=RECOVERY_FINISH, invalidParts=null, state=PREPARED, timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207, cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null, duration=20048ms, onePhaseCommit=false]]], completionTime=0, duration=20048]
> {code}
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
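
The report above notes that the future used for delivering partition counters exposes no status information and that its onNodeLeft handling is one suspected location of the hang. The following is a minimal, self-contained sketch of that general pattern, not Ignite's actual PartitionCountersNeighborcastFuture: the class AckTrackingFuture and its methods onAck, onNodeLeft and pendingNodes are hypothetical names introduced only for illustration. It assumes the future should complete once every remote node has either acknowledged or left the topology, and that its toString should list the nodes still being waited for so that a diagnostic dump can show why it is not done.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical illustration only; this is NOT Ignite code.
 * Completes when every tracked node has acknowledged or left.
 */
class AckTrackingFuture extends CompletableFuture<Void> {
    /** Nodes whose acknowledgement is still outstanding. */
    private final Set<UUID> pending = ConcurrentHashMap.newKeySet();

    AckTrackingFuture(Set<UUID> nodeIds) {
        pending.addAll(nodeIds);
        checkDone(); // Nothing to wait for if the set is empty.
    }

    /** Called when a remote node confirms delivery. */
    void onAck(UUID nodeId) {
        if (pending.remove(nodeId))
            checkDone();
    }

    /** Called when a node leaves: stop waiting for it, otherwise the future can hang forever. */
    void onNodeLeft(UUID nodeId) {
        if (pending.remove(nodeId))
            checkDone();
    }

    /** Snapshot of the nodes still being waited for, for monitoring. */
    Set<UUID> pendingNodes() {
        return Collections.unmodifiableSet(new HashSet<>(pending));
    }

    private void checkDone() {
        if (pending.isEmpty())
            complete(null); // Idempotent: repeated calls have no further effect.
    }

    /** Status line suitable for a diagnostic log dump. */
    @Override public String toString() {
        return "AckTrackingFuture [done=" + isDone() + ", pending=" + pending + "]";
    }
}
```

With a status line like this, a "Failed to wait for partition release future" dump could show which node the future is still waiting on. Whether the real fix belongs in onNodeLeft is for the issue itself to establish.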