ignite-issues mailing list archives

From "Sergey Kosarev (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-10589) Multiple server node failure after a client node stopping
Date Fri, 07 Dec 2018 09:32:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-10589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Kosarev updated IGNITE-10589:
------------------------------------
    Description: 
After a client node stops, the coordinator sees the topology change and finishes PME (partition map exchange).
Shortly afterwards, other nodes that have not yet seen the new topology hit a
critical error that causes those nodes to fail.
Coordinator (crd) log:
{code}
2018-12-06 15:55:23.660 [WARN ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Node FAILED: ZookeeperClusterNode [id=979f03db-f858-44f6-8646-12034dfd5c93, addrs=[10.116.206.1], order=129, loc=false, client=true]
2018-12-06 15:55:23.660 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Topology snapshot [ver=162, servers=128, clients=0, CPUs=7168, offheap=140000.0GB, heap=4000.0GB]
2018-12-06 15:55:23.660 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- Node [id=44D27930-80E5-4EB7-B377-8B07C02C2033, clusterState=ACTIVE]
2018-12-06 15:55:23.660 [INFO ][zk-DPL_GRID%DplGridNodeName-EventThread][o.a.i.s.d.z.i.ZookeeperDiscoveryImpl] Process alive nodes change [alives=128]
2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- Baseline [id=0, size=128, online=128, offline=0]
2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Data Regions Configured:
2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- dpl_mem_plc [initSize=256.0 MiB, maxSize=556.6 GiB, persistenceEnabled=true]
2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- not-persisted [initSize=256.0 MiB, maxSize=556.6 GiB, persistenceEnabled=false]
2018-12-06 15:55:23.670 [DEBUG][sys-#564%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.l.ExchangeLatchManager] Process node left 979f03db-f858-44f6-8646-12034dfd5c93
2018-12-06 15:55:23.670 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.exchange.time] Started exchange init [topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], crd=true, evt=NODE_FAILED, evtNode=979f03db-f858-44f6-8646-12034dfd5c93, customEvt=null, allowMerge=true]
2018-12-06 15:55:23.712 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.exchange.time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], crd=true]
2018-12-06 15:55:23.699 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], err=null]
{code}
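
To compare when each node observed topology version 162, the version can be pulled out of the "Topology snapshot" log lines. This is a minimal Java sketch, assuming the message format shown in the crd log above; the class TopologySnapshotLine and its parse method are hypothetical helpers and not part of Ignite.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for reading "Topology snapshot" log lines; not Ignite code. */
final class TopologySnapshotLine {
    /** Assumes the "[ver=..., servers=..., clients=..." layout seen in the crd log. */
    private static final Pattern SNAPSHOT =
        Pattern.compile("Topology snapshot \\[ver=(\\d+), servers=(\\d+), clients=(\\d+)");

    final long ver;
    final int servers;
    final int clients;

    private TopologySnapshotLine(long ver, int servers, int clients) {
        this.ver = ver;
        this.servers = servers;
        this.clients = clients;
    }

    /** Returns the parsed fields, or null when the line is not a topology snapshot line. */
    static TopologySnapshotLine parse(String logLine) {
        Matcher m = SNAPSHOT.matcher(logLine);

        if (!m.find())
            return null;

        return new TopologySnapshotLine(
            Long.parseLong(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)));
    }

    public static void main(String[] args) {
        // Message part of the coordinator's snapshot line, without the timestamp and logger prefix.
        String msg = "Topology snapshot [ver=162, servers=128, clients=0, "
            + "CPUs=7168, offheap=140000.0GB, heap=4000.0GB]";

        TopologySnapshotLine s = parse(msg);

        System.out.println("ver=" + s.ver + ", servers=" + s.servers + ", clients=" + s.clients);
    }
}
```

Running the same parse over the logs of the failed nodes would show whether they had logged version 162 before the error timestamps.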

On one node, Node(1), we have critical error (1):
{code}
2018-12-06 15:55:23.727 [ERROR][utility-#432%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed processing message [senderId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, msg=GridDhtTxPrepareRequest [nearNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033,
 futId=1d225238761-05eea259-5c25-4a4b-8469-9dd8980e218c, miniId=105, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], invalidateNearEntries={}, nearWrites=null, owned=null, nearXidVer=GridCacheVersion [topVer=155571374, order=1545423626166, nodeOrd
er=1], subjId=44d27930-80e5-4eb7-b377-8b07c02c2033, taskNameHash=0, preloadKeys=null, skipCompletedVers=false, super=GridDistributedTxPrepareRequest [threadId=1281, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=15557137
4, order=1545423626614, nodeOrder=96], timeout=0, reads=null, writes=ArrayList [IgniteTxEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601, txKey=IgniteTxKey [key=KeyCa
cheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601], val=CacheObjectImpl [val=GridServiceAssignments [nodeId=426a4a51-1af3-4019-9769-4a58d8ece426, topVer=162, cfg=LazyServiceConfigurat
ion [srvcClsName=com.sbt.dpl.gridgain.thread.DPLThreadManager, svcCls=, nodeFilterCls=IgniteAllNodesPredicate], assigns=HashMap {74979bc7-e4c3-424a-8347-2f3a589aca3e=1, 32afc0a1-156d-4998-9b64-2336b86fb1c2=1, 6b34e59e-924c-404b-9451-ddf4b8b935b5=1, c2af8947-1
a8d-48d5-93ca-165f97399519=1, 12f327e6-0f5f-44e2-be2c-f7bf99d22eec=1, 5638cb8c-4abb-49e3-8eef-edb3d5ccad77=1, 20349aca-ecf9-4a5f-bce9-2640e54fbbb2=1, e83f5de6-deaf-4f60-af5c-5a13d4f251a7=1, 2b4261be-feb5-4d59-be2a-6c9fcbe2fa4a=1, c3f0bff0-08fc-4601-b9f9-33192
9065c0a=1, bac0156c-a56b-4ab8-aa7c-7d9878151e9f=1, ccc2d442-8df4-402e-8589-d8ec3c6ec243=1, e34256a2-1bb9-4a17-86d1-21532833dded=1, 50bae4c5-16a5-48bf-a3f6-aa1b123074af=1, c5ecce59-cf6f-4be3-9861-e5c2622480a5=1, d95ad91e-abd6-4c59-bc02-6298278f84c5=1, 035787c0
-4497-4682-9488-9be55e875175=1, e17bd18f-4a71-47f5-bdd6-64199a9bfb3a=1, e3565de2-04a3-4107-95d9-01cdd790838b=1, dd372a51-8239-4f8f-8eef-5d6f206e971e=1, c7ff660d-a003-493b-9aac-a6f73ad46561=1, 426a4a51-1af3-4019-9769-4a58d8ece426=1, 107810a2-c04d-452c-b4b5-d61
abf16272c=1, 99df5f9f-5bc9-4f6b-a538-25c097124f38=1, 93baebe8-c8fc-4e25-8b6a-17f925e67dce=1, 44f063cf-9ad5-4095-96a3-54e554ab9ca3=1, d4f7a539-cc66-4d76-afd6-2d41f533c44e=1, 0be09f47-0a70-4589-a78c-6c9fb7393d43=1, 6094f19a-6754-4ce9-8892-a540a52cf775=1, 17ad38
b2-1ec5-4531-9fca-397acbfa4a98=1, 23f220b1-cddf-4ac8-8987-05ec33569855=1, e7b59e5d-3102-4dda-9d84-95783d80940d=1, cd9fc4bb-b488-40f5-8a72-24ecf633e3b0=1, 76fd7993-7c53-4a59-b199-ece9ce6f1b32=1, 8aef872d-83cd-42f1-9641-612206f1d026=1, 158087d4-2ca2-4ee2-83be-7
a61e52c9aac=1, 59b4bfeb-7690-4844-b3b9-d7939d72c098=1, 16bd8778-33a1-4a06-a904-8792f9991921=1, 5198403c-ffb1-4b64-b523-202cc76aee59=1, fd636af8-4dea-4df1-b6b1-099bd14ec8aa=1, aa2471f8-f5cd-445f-8b79-fa58e08783a7=1, 22b3ebec-8a3e-4320-8ccc-a0c968362222=1, b72f
a1b9-7547-452c-99d2-5b61b6f8ebec=1, c56babe1-1071-4d8e-9499-3fc211d375a0=1, 43ff2a2f-db9c-4f5e-b721-ef350451ca0a=1, 20a6ff76-53f6-4508-8e4c-c87359e625a8=1, 07377a77-24da-4de3-8993-595b6f77e199=1, 0923e9e8-b412-40ae-bbb6-c75e1923cbd8=1, 416e7265-7507-4d05-b444
-0118999146a1=1, cd12c8c4-d209-4e5b-b34e-9cf25523ee7d=1, 32552ec6-5a88-4ae6-a069-0d86594b7031=1, 89a4c1a0-33a1-40b2-a443-f15533bd13d7=1, b6545e2e-6ee5-4c0f-8fd8-3448bc2ab546=1, ab1dedb8-919b-4373-af45-250f12c7a8af=1, 944d0054-4b92-4eae-80dc-b6a45fe415c8=1, b7
147304-51c3-4c2d-9fb7-45853b27b79f=1, 6e334bfa-05d7-4909-94fb-e812d0fd4c76=1, a91104a0-e902-46e4-8725-47438b48b102=1, 5e5fb902-1c22-488a-bb25-87d10c8ccc0e=1, 297d27cc-4cd1-486f-a922-0a27808d8304=1, 5e069562-4ac7-48c6-8ce2-e6feb6be3d44=1, 7db65bfd-5478-4b3d-8d
2f-7f35696fe6fe=1, 66dbd7b0-c7b2-421e-bc11-edc3919e4a0c=1, c83a8121-38e8-4409-8cdb-d4929bf4d0cf=1, f9b0d868-0a8e-41a5-9c45-73ae688ffc1e=1, be79ee19-8cc4-467c-bb15-2b066c27d667=1, 9e54bf88-b439-4e2f-9f14-4dee1bb66a0d=1, 9eee3d1d-702e-4e02-916f-f21b4b5dc27a=1,
e347eecb-3489-4e0d-a7dc-420854b8b3e9=1, 9cdf5210-6624-4f2f-9314-3e6bf3b23587=1, 1e17c56a-5213-4a1b-b94b-4575a95a2c81=1, 1988168d-839c-471b-9a83-3c48f0a7447f=1, 835ed2f5-6fd4-43e4-84bd-504f5df0e301=1, 8852b77b-17ee-47a7-af2f-4d63babe970e=1, d7b33e74-5683-40c4-
983d-6e2661183022=1, 16997ebc-723f-4b26-a14e-c98c40191646=1, 00dd20b3-12b4-46fe-800c-3ec405d11d98=1, b08008de-2ad0-4777-8365-2c7db6b470f3=1, 2bba7299-41f5-4635-8896-c6995425796e=1, 9d9234fe-3a77-4623-b83f-4297193ddf04=1, fbd450b0-b545-467c-acd6-2d4acf53f239=1
, c36dc0f8-7e04-43c0-b1af-8883c42128f2=1, 8da84ab8-1d64-41c1-b904-03348fec36c6=1, fce7c69c-60c7-4006-8d2e-7363309c05d8=1, d7412fac-3f80-4c92-806d-cf10ae545a63=1, 7bafe6ad-d5fa-45d0-8c86-cef3d3d1bd3a=1, ce614913-d061-4e6b-ab6d-58df42fdab80=1, d1b9e739-b957-43e
9-80d2-c2a984c48639=1, af32cdf5-01f7-4196-aea6-e07ed36de5aa=1, 6e64c0d8-2bcb-428c-b64b-bd3391763d4e=1, 9b980efd-edee-462b-adad-c667e4b4ee65=1, 205dddb8-defe-4717-93a3-05953ceb406d=1, dabfb14f-5b83-469b-9798-ab89b440f379=1, b05053e8-dc96-42b8-99bd-a4918c950aed
=1, 2645af19-64fe-4b9d-bbba-a3d9202105be=1, 3bf0cb89-bdb4-47ff-81e7-ce67377d750b=1, 667bd106-cb20-429c-adec-07293f794db9=1, 4e7d90fb-b536-4ac0-83ca-036e151bf707=1, a0d5cf3b-7c1e-4ae6-a39c-b1c98a04ea8f=1, 625eccc2-d955-4bf9-a293-f53b945b0f09=1... and 28 more}]
, hasValBytes=true][op=UPDATE, val=], prevVal=[op=NOOP, val=null], oldVal=CacheObjectImpl [val=null, hasValBytes=true][op=UPDATE, val=], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntr
yPredicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], val=CacheObjectImpl [val=null, hasValBytes=true], startVer=1545
426660336, ver=GridCacheVersion [topVer=155571373, order=1544175423890, nodeOrder=96], hash=1900127065, extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=null, rmts=LinkedList [GridCacheMvccCandidate [nodeId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, ver=G
ridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], threadId=1942, id=219868, topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], reentry=null, otherNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033, otherVer=null, mappedDhtNodes=null, map
pedNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], masks=local=0|owner=0|ready=0|reentry=0|used=0|tx=1|single_implicit=0|dht_local=0|near_local=0|
removed=0|read=0, prevVer=null, nextVer=null]]]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=65, super=], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, part
UpdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=5, txState=null, flags=last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], committedVers=null, rolledbackVers=null, c
nt=0, super=GridCacheIdMessage [cacheId=0]]]]]
org.apache.ignite.IgniteException: Failed to resolve nodes topology [cacheGrp=N/A, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], history=[AffinityTopologyVersion [topVer=161, minorTopVer=0]], snap=Snapshot [topVer=AffinityTopologyVersion [topVer
=161, minorTopVer=0]], locNode=ZookeeperClusterNode [id=51dc74ab-c989-4268-b850-ed69a24cca30, addrs=[10.116.206.28], order=161, loc=true, client=false]]
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.resolveDiscoCache(GridDiscoveryManager.java:2111)
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.consistentId(GridDiscoveryManager.java:1950)
        at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactIds(ConsistentIdMapper.java:104)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1142)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:993)
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.prepareRemoteTx(GridDistributedTxRemoteAdapter.java:407)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.startRemoteTx(IgniteTxHandler.java:1759)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxPrepareRequest(IgniteTxHandler.java:1121)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$400(IgniteTxHandler.java:101)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:205)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:203)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}
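
The exception in error (1) reports a request for topVer=162 while the local history holds only topVer=161. The following is a simplified Java model of that situation, not Ignite's GridDiscoveryManager; the class TopologyHistorySketch and its methods are hypothetical and exist only for illustration.

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical stand-in for a node's local topology history; not Ignite code. */
final class TopologyHistorySketch {
    /** Topology version -> node IDs known locally for that version. */
    private final NavigableMap<Long, Set<String>> history = new TreeMap<>();

    /** Records a topology version once the local node has processed it. */
    void onTopologyChange(long topVer, Set<String> nodeIds) {
        history.put(topVer, nodeIds);
    }

    /** Returns the node set for a version, or fails if the version is not in local history. */
    Set<String> resolve(long topVer) {
        Set<String> nodes = history.get(topVer);

        if (nodes == null)
            throw new IllegalStateException("Failed to resolve nodes topology [topVer="
                + topVer + ", history=" + history.keySet() + "]");

        return nodes;
    }

    public static void main(String[] args) {
        TopologyHistorySketch local = new TopologyHistorySketch();

        // The local node has processed version 161 only.
        local.onTopologyChange(161L, Collections.singleton("51dc74ab-c989-4268-b850-ed69a24cca30"));

        try {
            // A message stamped with version 162 arrives before the local node has seen it.
            local.resolve(162L);
        }
        catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the lookup fails only because the message arrives before the local node records the new version; whether that is the actual cause here is not established by the logs alone.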

On many other nodes we have critical error (2): they are unable to find *nodeId=51dc74ab-c989-4268-b850-ed69a24cca30*, which is Node(1), and nearNodeId is the coordinator (CRD):
{code}
2018-12-06 15:55:23.730 [ERROR][utility-#386%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed processing message [senderId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, msg=GridDhtTxPrepareRequest [nearNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033,
 futId=1d225238761-05eea259-5c25-4a4b-8469-9dd8980e218c, miniId=79, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], invalidateNearEntries={}, nearWrites=null, owned=null, nearXidVer=GridCacheVersion [topVer=155571374, order=1545423626166, nodeOrde
r=1], subjId=44d27930-80e5-4eb7-b377-8b07c02c2033, taskNameHash=0, preloadKeys=null, skipCompletedVers=false, super=GridDistributedTxPrepareRequest [threadId=1281, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=155571374
, order=1545423626614, nodeOrder=96], timeout=0, reads=null, writes=ArrayList [IgniteTxEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601, txKey=IgniteTxKey [key=KeyCac
heObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601], val=CacheObjectImpl [val=GridServiceAssignments [nodeId=426a4a51-1af3-4019-9769-4a58d8ece426, topVer=162, cfg=LazyServiceConfigurati
on [srvcClsName=com.sbt.dpl.gridgain.thread.DPLThreadManager, svcCls=, nodeFilterCls=IgniteAllNodesPredicate], assigns=HashMap {74979bc7-e4c3-424a-8347-2f3a589aca3e=1, 32afc0a1-156d-4998-9b64-2336b86fb1c2=1, 6b34e59e-924c-404b-9451-ddf4b8b935b5=1, c2af8947-1a
8d-48d5-93ca-165f97399519=1, 12f327e6-0f5f-44e2-be2c-f7bf99d22eec=1, 5638cb8c-4abb-49e3-8eef-edb3d5ccad77=1, 20349aca-ecf9-4a5f-bce9-2640e54fbbb2=1, e83f5de6-deaf-4f60-af5c-5a13d4f251a7=1, 2b4261be-feb5-4d59-be2a-6c9fcbe2fa4a=1, c3f0bff0-08fc-4601-b9f9-331929
065c0a=1, bac0156c-a56b-4ab8-aa7c-7d9878151e9f=1, ccc2d442-8df4-402e-8589-d8ec3c6ec243=1, e34256a2-1bb9-4a17-86d1-21532833dded=1, 50bae4c5-16a5-48bf-a3f6-aa1b123074af=1, c5ecce59-cf6f-4be3-9861-e5c2622480a5=1, d95ad91e-abd6-4c59-bc02-6298278f84c5=1, 035787c0-
4497-4682-9488-9be55e875175=1, e17bd18f-4a71-47f5-bdd6-64199a9bfb3a=1, e3565de2-04a3-4107-95d9-01cdd790838b=1, dd372a51-8239-4f8f-8eef-5d6f206e971e=1, c7ff660d-a003-493b-9aac-a6f73ad46561=1, 426a4a51-1af3-4019-9769-4a58d8ece426=1, 107810a2-c04d-452c-b4b5-d61a
bf16272c=1, 99df5f9f-5bc9-4f6b-a538-25c097124f38=1, 93baebe8-c8fc-4e25-8b6a-17f925e67dce=1, 44f063cf-9ad5-4095-96a3-54e554ab9ca3=1, d4f7a539-cc66-4d76-afd6-2d41f533c44e=1, 0be09f47-0a70-4589-a78c-6c9fb7393d43=1, 6094f19a-6754-4ce9-8892-a540a52cf775=1, 17ad38b
2-1ec5-4531-9fca-397acbfa4a98=1, 23f220b1-cddf-4ac8-8987-05ec33569855=1, e7b59e5d-3102-4dda-9d84-95783d80940d=1, cd9fc4bb-b488-40f5-8a72-24ecf633e3b0=1, 76fd7993-7c53-4a59-b199-ece9ce6f1b32=1, 8aef872d-83cd-42f1-9641-612206f1d026=1, 158087d4-2ca2-4ee2-83be-7a
61e52c9aac=1, 59b4bfeb-7690-4844-b3b9-d7939d72c098=1, 16bd8778-33a1-4a06-a904-8792f9991921=1, 5198403c-ffb1-4b64-b523-202cc76aee59=1, fd636af8-4dea-4df1-b6b1-099bd14ec8aa=1, aa2471f8-f5cd-445f-8b79-fa58e08783a7=1, 22b3ebec-8a3e-4320-8ccc-a0c968362222=1, b72fa
1b9-7547-452c-99d2-5b61b6f8ebec=1, c56babe1-1071-4d8e-9499-3fc211d375a0=1, 43ff2a2f-db9c-4f5e-b721-ef350451ca0a=1, 20a6ff76-53f6-4508-8e4c-c87359e625a8=1, 07377a77-24da-4de3-8993-595b6f77e199=1, 0923e9e8-b412-40ae-bbb6-c75e1923cbd8=1, 416e7265-7507-4d05-b444-
0118999146a1=1, cd12c8c4-d209-4e5b-b34e-9cf25523ee7d=1, 32552ec6-5a88-4ae6-a069-0d86594b7031=1, 89a4c1a0-33a1-40b2-a443-f15533bd13d7=1, b6545e2e-6ee5-4c0f-8fd8-3448bc2ab546=1, ab1dedb8-919b-4373-af45-250f12c7a8af=1, 944d0054-4b92-4eae-80dc-b6a45fe415c8=1, b71
47304-51c3-4c2d-9fb7-45853b27b79f=1, 6e334bfa-05d7-4909-94fb-e812d0fd4c76=1, a91104a0-e902-46e4-8725-47438b48b102=1, 5e5fb902-1c22-488a-bb25-87d10c8ccc0e=1, 297d27cc-4cd1-486f-a922-0a27808d8304=1, 5e069562-4ac7-48c6-8ce2-e6feb6be3d44=1, 7db65bfd-5478-4b3d-8d2
f-7f35696fe6fe=1, 66dbd7b0-c7b2-421e-bc11-edc3919e4a0c=1, c83a8121-38e8-4409-8cdb-d4929bf4d0cf=1, f9b0d868-0a8e-41a5-9c45-73ae688ffc1e=1, be79ee19-8cc4-467c-bb15-2b066c27d667=1, 9e54bf88-b439-4e2f-9f14-4dee1bb66a0d=1, 9eee3d1d-702e-4e02-916f-f21b4b5dc27a=1, e
347eecb-3489-4e0d-a7dc-420854b8b3e9=1, 9cdf5210-6624-4f2f-9314-3e6bf3b23587=1, 1e17c56a-5213-4a1b-b94b-4575a95a2c81=1, 1988168d-839c-471b-9a83-3c48f0a7447f=1, 835ed2f5-6fd4-43e4-84bd-504f5df0e301=1, 8852b77b-17ee-47a7-af2f-4d63babe970e=1, d7b33e74-5683-40c4-9
83d-6e2661183022=1, 16997ebc-723f-4b26-a14e-c98c40191646=1, 00dd20b3-12b4-46fe-800c-3ec405d11d98=1, b08008de-2ad0-4777-8365-2c7db6b470f3=1, 2bba7299-41f5-4635-8896-c6995425796e=1, 9d9234fe-3a77-4623-b83f-4297193ddf04=1, fbd450b0-b545-467c-acd6-2d4acf53f239=1,
 c36dc0f8-7e04-43c0-b1af-8883c42128f2=1, 8da84ab8-1d64-41c1-b904-03348fec36c6=1, fce7c69c-60c7-4006-8d2e-7363309c05d8=1, d7412fac-3f80-4c92-806d-cf10ae545a63=1, 7bafe6ad-d5fa-45d0-8c86-cef3d3d1bd3a=1, ce614913-d061-4e6b-ab6d-58df42fdab80=1, d1b9e739-b957-43e9
-80d2-c2a984c48639=1, af32cdf5-01f7-4196-aea6-e07ed36de5aa=1, 6e64c0d8-2bcb-428c-b64b-bd3391763d4e=1, 9b980efd-edee-462b-adad-c667e4b4ee65=1, 205dddb8-defe-4717-93a3-05953ceb406d=1, dabfb14f-5b83-469b-9798-ab89b440f379=1, b05053e8-dc96-42b8-99bd-a4918c950aed=
1, 2645af19-64fe-4b9d-bbba-a3d9202105be=1, 3bf0cb89-bdb4-47ff-81e7-ce67377d750b=1, 667bd106-cb20-429c-adec-07293f794db9=1, 4e7d90fb-b536-4ac0-83ca-036e151bf707=1, a0d5cf3b-7c1e-4ae6-a39c-b1c98a04ea8f=1, 625eccc2-d955-4bf9-a293-f53b945b0f09=1... and 28 more}],
 hasValBytes=true][op=UPDATE, val=], prevVal=[op=NOOP, val=null], oldVal=CacheObjectImpl [val=null, hasValBytes=true][op=UPDATE, val=], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntry
Predicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], val=CacheObjectImpl [val=null, hasValBytes=true], startVer=15454
44833545, ver=GridCacheVersion [topVer=155571373, order=1544175423890, nodeOrder=96], hash=1900127065, extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=null, rmts=LinkedList [GridCacheMvccCandidate [nodeId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, ver=Gr
idCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], threadId=1937, id=210410, topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], reentry=null, otherNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033, otherVer=null, mappedDhtNodes=null, mapp
edNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], masks=local=0|owner=0|ready=0|reentry=0|used=0|tx=1|single_implicit=0|dht_local=0|near_local=0|r
emoved=0|read=0, prevVer=null, nextVer=null]]]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=65, super=], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partU
pdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=5, txState=null, flags=last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], committedVers=null, rolledbackVers=null, cn
t=0, super=GridCacheIdMessage [cacheId=0]]]]]
java.lang.IllegalStateException: Unable to find consistentId by UUID [nodeId=51dc74ab-c989-4268-b850-ed69a24cca30, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0]]
        at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactId(ConsistentIdMapper.java:62)
        at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactIds(ConsistentIdMapper.java:123)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1142)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:993)
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.prepareRemoteTx(GridDistributedTxRemoteAdapter.java:407)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.startRemoteTx(IgniteTxHandler.java:1759)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxPrepareRequest(IgniteTxHandler.java:1121)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$400(IgniteTxHandler.java:101)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:205)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:203)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}
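
Error (2) is a lookup of a node UUID within a given topology version that finds nothing. The following is a simplified Java model of such a per-version lookup, not Ignite's ConsistentIdMapper; the class NodeLookupSketch, its methods, and the consistent ID value "node-A" are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical model of a per-version UUID lookup; not Ignite code. */
final class NodeLookupSketch {
    /** Topology version -> (node UUID -> consistent ID) as known on the local node. */
    private final Map<Long, Map<UUID, String>> byVersion = new HashMap<>();

    /** Records that a node is part of the given topology version. */
    void register(long topVer, UUID nodeId, String consistentId) {
        byVersion.computeIfAbsent(topVer, v -> new HashMap<>()).put(nodeId, consistentId);
    }

    /** Returns the consistent ID, or fails when the node is unknown for that version. */
    String consistentId(long topVer, UUID nodeId) {
        Map<UUID, String> nodes = byVersion.get(topVer);
        String id = nodes == null ? null : nodes.get(nodeId);

        if (id == null)
            throw new IllegalStateException("Unable to find consistentId by UUID [nodeId="
                + nodeId + ", topVer=" + topVer + "]");

        return id;
    }

    public static void main(String[] args) {
        NodeLookupSketch local = new NodeLookupSketch();

        // The sender is known for version 162; "node-A" is a made-up consistent ID.
        local.register(162L, UUID.fromString("1e17c56a-5213-4a1b-b94b-4575a95a2c81"), "node-A");

        try {
            // Node(1) is not known locally for version 162, so the lookup fails.
            local.consistentId(162L, UUID.fromString("51dc74ab-c989-4268-b850-ed69a24cca30"));
        }
        catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```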


Some details I noticed:
There are no diagnostic "Metrics for local node" messages in the logs of
Node(1), although the grid-timeout-worker thread is present in the thread dump:
Thread [name="grid-timeout-worker-#119%DPL_GRID%DplGridNodeName%", id=366, state=TIMED_WAITING, blockCnt=2, waitCnt=247178]
    Lock [object=java.lang.Object@682fdbd8, ownerName=null, ownerId=-1]
        at java.lang.Object.wait(Native Method)
        at o.a.i.i.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:258)
        at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:745)


  was:
After a client node stops, the coordinator sees the topology change and finishes PME (partition map exchange).
Shortly afterwards, other nodes that have not yet seen the new topology hit a
critical error that causes those nodes to fail.
Coordinator (crd) log:
{code}
2018-12-06 15:55:23.660 [WARN ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Node FAILED: ZookeeperClusterNode [id=979f03db-f858-44f6-8646-12034dfd5c93, addrs=[10.116.206.1], order=129, loc=false, client=true]
2018-12-06 15:55:23.660 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Topology snapshot [ver=162, servers=128, clients=0, CPUs=7168, offheap=140000.0GB, heap=4000.0GB]
2018-12-06 15:55:23.660 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- Node [id=44D27930-80E5-4EB7-B377-8B07C02C2033, clusterState=ACTIVE]
2018-12-06 15:55:23.660 [INFO ][zk-DPL_GRID%DplGridNodeName-EventThread][o.a.i.s.d.z.i.ZookeeperDiscoveryImpl] Process alive nodes change [alives=128]
2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- Baseline [id=0, size=128, online=128, offline=0]
2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Data Regions Configured:
2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- dpl_mem_plc [initSize=256.0 MiB, maxSize=556.6 GiB, persistenceEnabled=true]
2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- not-persisted [initSize=256.0 MiB, maxSize=556.6 GiB, persistenceEnabled=false]
2018-12-06 15:55:23.670 [DEBUG][sys-#564%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.l.ExchangeLatchManager] Process node left 979f03db-f858-44f6-8646-12034dfd5c93
2018-12-06 15:55:23.670 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.exchange.time] Started exchange init [topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], crd=true, evt=NODE_FAILED, evtNode=979f03db-f858-44f6-8646-12034dfd5c93, customEvt=null, allowMerge=true]
2018-12-06 15:55:23.712 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.exchange.time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], crd=true]
2018-12-06 15:55:23.699 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], err=null]
{code}

On one node, Node(1), we have critical error (1):
{code}
2018-12-06 15:55:23.727 [ERROR][utility-#432%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed processing message [senderId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, msg=GridDhtTxPrepareRequest [nearNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033,
 futId=1d225238761-05eea259-5c25-4a4b-8469-9dd8980e218c, miniId=105, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], invalidateNearEntries={}, nearWrites=null, owned=null, nearXidVer=GridCacheVersion [topVer=155571374, order=1545423626166, nodeOrd
er=1], subjId=44d27930-80e5-4eb7-b377-8b07c02c2033, taskNameHash=0, preloadKeys=null, skipCompletedVers=false, super=GridDistributedTxPrepareRequest [threadId=1281, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=15557137
4, order=1545423626614, nodeOrder=96], timeout=0, reads=null, writes=ArrayList [IgniteTxEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601, txKey=IgniteTxKey [key=KeyCa
cheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601], val=CacheObjectImpl [val=GridServiceAssignments [nodeId=426a4a51-1af3-4019-9769-4a58d8ece426, topVer=162, cfg=LazyServiceConfigurat
ion [srvcClsName=com.sbt.dpl.gridgain.thread.DPLThreadManager, svcCls=, nodeFilterCls=IgniteAllNodesPredicate], assigns=HashMap {74979bc7-e4c3-424a-8347-2f3a589aca3e=1, 32afc0a1-156d-4998-9b64-2336b86fb1c2=1, 6b34e59e-924c-404b-9451-ddf4b8b935b5=1, c2af8947-1
a8d-48d5-93ca-165f97399519=1, 12f327e6-0f5f-44e2-be2c-f7bf99d22eec=1, 5638cb8c-4abb-49e3-8eef-edb3d5ccad77=1, 20349aca-ecf9-4a5f-bce9-2640e54fbbb2=1, e83f5de6-deaf-4f60-af5c-5a13d4f251a7=1, 2b4261be-feb5-4d59-be2a-6c9fcbe2fa4a=1, c3f0bff0-08fc-4601-b9f9-33192
9065c0a=1, bac0156c-a56b-4ab8-aa7c-7d9878151e9f=1, ccc2d442-8df4-402e-8589-d8ec3c6ec243=1, e34256a2-1bb9-4a17-86d1-21532833dded=1, 50bae4c5-16a5-48bf-a3f6-aa1b123074af=1, c5ecce59-cf6f-4be3-9861-e5c2622480a5=1, d95ad91e-abd6-4c59-bc02-6298278f84c5=1, 035787c0
-4497-4682-9488-9be55e875175=1, e17bd18f-4a71-47f5-bdd6-64199a9bfb3a=1, e3565de2-04a3-4107-95d9-01cdd790838b=1, dd372a51-8239-4f8f-8eef-5d6f206e971e=1, c7ff660d-a003-493b-9aac-a6f73ad46561=1, 426a4a51-1af3-4019-9769-4a58d8ece426=1, 107810a2-c04d-452c-b4b5-d61
abf16272c=1, 99df5f9f-5bc9-4f6b-a538-25c097124f38=1, 93baebe8-c8fc-4e25-8b6a-17f925e67dce=1, 44f063cf-9ad5-4095-96a3-54e554ab9ca3=1, d4f7a539-cc66-4d76-afd6-2d41f533c44e=1, 0be09f47-0a70-4589-a78c-6c9fb7393d43=1, 6094f19a-6754-4ce9-8892-a540a52cf775=1, 17ad38
b2-1ec5-4531-9fca-397acbfa4a98=1, 23f220b1-cddf-4ac8-8987-05ec33569855=1, e7b59e5d-3102-4dda-9d84-95783d80940d=1, cd9fc4bb-b488-40f5-8a72-24ecf633e3b0=1, 76fd7993-7c53-4a59-b199-ece9ce6f1b32=1, 8aef872d-83cd-42f1-9641-612206f1d026=1, 158087d4-2ca2-4ee2-83be-7
a61e52c9aac=1, 59b4bfeb-7690-4844-b3b9-d7939d72c098=1, 16bd8778-33a1-4a06-a904-8792f9991921=1, 5198403c-ffb1-4b64-b523-202cc76aee59=1, fd636af8-4dea-4df1-b6b1-099bd14ec8aa=1, aa2471f8-f5cd-445f-8b79-fa58e08783a7=1, 22b3ebec-8a3e-4320-8ccc-a0c968362222=1, b72f
a1b9-7547-452c-99d2-5b61b6f8ebec=1, c56babe1-1071-4d8e-9499-3fc211d375a0=1, 43ff2a2f-db9c-4f5e-b721-ef350451ca0a=1, 20a6ff76-53f6-4508-8e4c-c87359e625a8=1, 07377a77-24da-4de3-8993-595b6f77e199=1, 0923e9e8-b412-40ae-bbb6-c75e1923cbd8=1, 416e7265-7507-4d05-b444
-0118999146a1=1, cd12c8c4-d209-4e5b-b34e-9cf25523ee7d=1, 32552ec6-5a88-4ae6-a069-0d86594b7031=1, 89a4c1a0-33a1-40b2-a443-f15533bd13d7=1, b6545e2e-6ee5-4c0f-8fd8-3448bc2ab546=1, ab1dedb8-919b-4373-af45-250f12c7a8af=1, 944d0054-4b92-4eae-80dc-b6a45fe415c8=1, b7
147304-51c3-4c2d-9fb7-45853b27b79f=1, 6e334bfa-05d7-4909-94fb-e812d0fd4c76=1, a91104a0-e902-46e4-8725-47438b48b102=1, 5e5fb902-1c22-488a-bb25-87d10c8ccc0e=1, 297d27cc-4cd1-486f-a922-0a27808d8304=1, 5e069562-4ac7-48c6-8ce2-e6feb6be3d44=1, 7db65bfd-5478-4b3d-8d
2f-7f35696fe6fe=1, 66dbd7b0-c7b2-421e-bc11-edc3919e4a0c=1, c83a8121-38e8-4409-8cdb-d4929bf4d0cf=1, f9b0d868-0a8e-41a5-9c45-73ae688ffc1e=1, be79ee19-8cc4-467c-bb15-2b066c27d667=1, 9e54bf88-b439-4e2f-9f14-4dee1bb66a0d=1, 9eee3d1d-702e-4e02-916f-f21b4b5dc27a=1,
e347eecb-3489-4e0d-a7dc-420854b8b3e9=1, 9cdf5210-6624-4f2f-9314-3e6bf3b23587=1, 1e17c56a-5213-4a1b-b94b-4575a95a2c81=1, 1988168d-839c-471b-9a83-3c48f0a7447f=1, 835ed2f5-6fd4-43e4-84bd-504f5df0e301=1, 8852b77b-17ee-47a7-af2f-4d63babe970e=1, d7b33e74-5683-40c4-
983d-6e2661183022=1, 16997ebc-723f-4b26-a14e-c98c40191646=1, 00dd20b3-12b4-46fe-800c-3ec405d11d98=1, b08008de-2ad0-4777-8365-2c7db6b470f3=1, 2bba7299-41f5-4635-8896-c6995425796e=1, 9d9234fe-3a77-4623-b83f-4297193ddf04=1, fbd450b0-b545-467c-acd6-2d4acf53f239=1
, c36dc0f8-7e04-43c0-b1af-8883c42128f2=1, 8da84ab8-1d64-41c1-b904-03348fec36c6=1, fce7c69c-60c7-4006-8d2e-7363309c05d8=1, d7412fac-3f80-4c92-806d-cf10ae545a63=1, 7bafe6ad-d5fa-45d0-8c86-cef3d3d1bd3a=1, ce614913-d061-4e6b-ab6d-58df42fdab80=1, d1b9e739-b957-43e
9-80d2-c2a984c48639=1, af32cdf5-01f7-4196-aea6-e07ed36de5aa=1, 6e64c0d8-2bcb-428c-b64b-bd3391763d4e=1, 9b980efd-edee-462b-adad-c667e4b4ee65=1, 205dddb8-defe-4717-93a3-05953ceb406d=1, dabfb14f-5b83-469b-9798-ab89b440f379=1, b05053e8-dc96-42b8-99bd-a4918c950aed
=1, 2645af19-64fe-4b9d-bbba-a3d9202105be=1, 3bf0cb89-bdb4-47ff-81e7-ce67377d750b=1, 667bd106-cb20-429c-adec-07293f794db9=1, 4e7d90fb-b536-4ac0-83ca-036e151bf707=1, a0d5cf3b-7c1e-4ae6-a39c-b1c98a04ea8f=1, 625eccc2-d955-4bf9-a293-f53b945b0f09=1... and 28 more}]
, hasValBytes=true][op=UPDATE, val=], prevVal=[op=NOOP, val=null], oldVal=CacheObjectImpl [val=null, hasValBytes=true][op=UPDATE, val=], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntr
yPredicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], val=CacheObjectImpl [val=null, hasValBytes=true], startVer=1545
426660336, ver=GridCacheVersion [topVer=155571373, order=1544175423890, nodeOrder=96], hash=1900127065, extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=null, rmts=LinkedList [GridCacheMvccCandidate [nodeId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, ver=G
ridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], threadId=1942, id=219868, topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], reentry=null, otherNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033, otherVer=null, mappedDhtNodes=null, map
pedNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], masks=local=0|owner=0|ready=0|reentry=0|used=0|tx=1|single_implicit=0|dht_local=0|near_local=0|
removed=0|read=0, prevVer=null, nextVer=null]]]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=65, super=], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, part
UpdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=5, txState=null, flags=last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], committedVers=null, rolledbackVers=null, c
nt=0, super=GridCacheIdMessage [cacheId=0]]]]]
org.apache.ignite.IgniteException: Failed to resolve nodes topology [cacheGrp=N/A, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], history=[AffinityTopologyVersion [topVer=161, minorTopVer=0]], snap=Snapshot [topVer=AffinityTopologyVersion [topVer
=161, minorTopVer=0]], locNode=ZookeeperClusterNode [id=51dc74ab-c989-4268-b850-ed69a24cca30, addrs=[10.116.206.28], order=161, loc=true, client=false]]
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.resolveDiscoCache(GridDiscoveryManager.java:2111)
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.consistentId(GridDiscoveryManager.java:1950)
        at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactIds(ConsistentIdMapper.java:104)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1142)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:993)
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.prepareRemoteTx(GridDistributedTxRemoteAdapter.java:407)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.startRemoteTx(IgniteTxHandler.java:1759)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxPrepareRequest(IgniteTxHandler.java:1121)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$400(IgniteTxHandler.java:101)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:205)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:203)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}

On many other nodes we see critical error (2): they are unable to find *nodeId=51dc74ab-c989-4268-b850-ed69a24cca30*, which is the node above, and nearNodeId is the coordinator (CRD):
{code}
2018-12-06 15:55:23.730 [ERROR][utility-#386%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed processing message [senderId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, msg=GridDhtTxPrepareRequest [nearNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033,
 futId=1d225238761-05eea259-5c25-4a4b-8469-9dd8980e218c, miniId=79, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], invalidateNearEntries={}, nearWrites=null, owned=null, nearXidVer=GridCacheVersion [topVer=155571374, order=1545423626166, nodeOrde
r=1], subjId=44d27930-80e5-4eb7-b377-8b07c02c2033, taskNameHash=0, preloadKeys=null, skipCompletedVers=false, super=GridDistributedTxPrepareRequest [threadId=1281, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=155571374
, order=1545423626614, nodeOrder=96], timeout=0, reads=null, writes=ArrayList [IgniteTxEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601, txKey=IgniteTxKey [key=KeyCac
heObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601], val=CacheObjectImpl [val=GridServiceAssignments [nodeId=426a4a51-1af3-4019-9769-4a58d8ece426, topVer=162, cfg=LazyServiceConfigurati
on [srvcClsName=com.sbt.dpl.gridgain.thread.DPLThreadManager, svcCls=, nodeFilterCls=IgniteAllNodesPredicate], assigns=HashMap {74979bc7-e4c3-424a-8347-2f3a589aca3e=1, 32afc0a1-156d-4998-9b64-2336b86fb1c2=1, 6b34e59e-924c-404b-9451-ddf4b8b935b5=1, c2af8947-1a
8d-48d5-93ca-165f97399519=1, 12f327e6-0f5f-44e2-be2c-f7bf99d22eec=1, 5638cb8c-4abb-49e3-8eef-edb3d5ccad77=1, 20349aca-ecf9-4a5f-bce9-2640e54fbbb2=1, e83f5de6-deaf-4f60-af5c-5a13d4f251a7=1, 2b4261be-feb5-4d59-be2a-6c9fcbe2fa4a=1, c3f0bff0-08fc-4601-b9f9-331929
065c0a=1, bac0156c-a56b-4ab8-aa7c-7d9878151e9f=1, ccc2d442-8df4-402e-8589-d8ec3c6ec243=1, e34256a2-1bb9-4a17-86d1-21532833dded=1, 50bae4c5-16a5-48bf-a3f6-aa1b123074af=1, c5ecce59-cf6f-4be3-9861-e5c2622480a5=1, d95ad91e-abd6-4c59-bc02-6298278f84c5=1, 035787c0-
4497-4682-9488-9be55e875175=1, e17bd18f-4a71-47f5-bdd6-64199a9bfb3a=1, e3565de2-04a3-4107-95d9-01cdd790838b=1, dd372a51-8239-4f8f-8eef-5d6f206e971e=1, c7ff660d-a003-493b-9aac-a6f73ad46561=1, 426a4a51-1af3-4019-9769-4a58d8ece426=1, 107810a2-c04d-452c-b4b5-d61a
bf16272c=1, 99df5f9f-5bc9-4f6b-a538-25c097124f38=1, 93baebe8-c8fc-4e25-8b6a-17f925e67dce=1, 44f063cf-9ad5-4095-96a3-54e554ab9ca3=1, d4f7a539-cc66-4d76-afd6-2d41f533c44e=1, 0be09f47-0a70-4589-a78c-6c9fb7393d43=1, 6094f19a-6754-4ce9-8892-a540a52cf775=1, 17ad38b
2-1ec5-4531-9fca-397acbfa4a98=1, 23f220b1-cddf-4ac8-8987-05ec33569855=1, e7b59e5d-3102-4dda-9d84-95783d80940d=1, cd9fc4bb-b488-40f5-8a72-24ecf633e3b0=1, 76fd7993-7c53-4a59-b199-ece9ce6f1b32=1, 8aef872d-83cd-42f1-9641-612206f1d026=1, 158087d4-2ca2-4ee2-83be-7a
61e52c9aac=1, 59b4bfeb-7690-4844-b3b9-d7939d72c098=1, 16bd8778-33a1-4a06-a904-8792f9991921=1, 5198403c-ffb1-4b64-b523-202cc76aee59=1, fd636af8-4dea-4df1-b6b1-099bd14ec8aa=1, aa2471f8-f5cd-445f-8b79-fa58e08783a7=1, 22b3ebec-8a3e-4320-8ccc-a0c968362222=1, b72fa
1b9-7547-452c-99d2-5b61b6f8ebec=1, c56babe1-1071-4d8e-9499-3fc211d375a0=1, 43ff2a2f-db9c-4f5e-b721-ef350451ca0a=1, 20a6ff76-53f6-4508-8e4c-c87359e625a8=1, 07377a77-24da-4de3-8993-595b6f77e199=1, 0923e9e8-b412-40ae-bbb6-c75e1923cbd8=1, 416e7265-7507-4d05-b444-
0118999146a1=1, cd12c8c4-d209-4e5b-b34e-9cf25523ee7d=1, 32552ec6-5a88-4ae6-a069-0d86594b7031=1, 89a4c1a0-33a1-40b2-a443-f15533bd13d7=1, b6545e2e-6ee5-4c0f-8fd8-3448bc2ab546=1, ab1dedb8-919b-4373-af45-250f12c7a8af=1, 944d0054-4b92-4eae-80dc-b6a45fe415c8=1, b71
47304-51c3-4c2d-9fb7-45853b27b79f=1, 6e334bfa-05d7-4909-94fb-e812d0fd4c76=1, a91104a0-e902-46e4-8725-47438b48b102=1, 5e5fb902-1c22-488a-bb25-87d10c8ccc0e=1, 297d27cc-4cd1-486f-a922-0a27808d8304=1, 5e069562-4ac7-48c6-8ce2-e6feb6be3d44=1, 7db65bfd-5478-4b3d-8d2
f-7f35696fe6fe=1, 66dbd7b0-c7b2-421e-bc11-edc3919e4a0c=1, c83a8121-38e8-4409-8cdb-d4929bf4d0cf=1, f9b0d868-0a8e-41a5-9c45-73ae688ffc1e=1, be79ee19-8cc4-467c-bb15-2b066c27d667=1, 9e54bf88-b439-4e2f-9f14-4dee1bb66a0d=1, 9eee3d1d-702e-4e02-916f-f21b4b5dc27a=1, e
347eecb-3489-4e0d-a7dc-420854b8b3e9=1, 9cdf5210-6624-4f2f-9314-3e6bf3b23587=1, 1e17c56a-5213-4a1b-b94b-4575a95a2c81=1, 1988168d-839c-471b-9a83-3c48f0a7447f=1, 835ed2f5-6fd4-43e4-84bd-504f5df0e301=1, 8852b77b-17ee-47a7-af2f-4d63babe970e=1, d7b33e74-5683-40c4-9
83d-6e2661183022=1, 16997ebc-723f-4b26-a14e-c98c40191646=1, 00dd20b3-12b4-46fe-800c-3ec405d11d98=1, b08008de-2ad0-4777-8365-2c7db6b470f3=1, 2bba7299-41f5-4635-8896-c6995425796e=1, 9d9234fe-3a77-4623-b83f-4297193ddf04=1, fbd450b0-b545-467c-acd6-2d4acf53f239=1,
 c36dc0f8-7e04-43c0-b1af-8883c42128f2=1, 8da84ab8-1d64-41c1-b904-03348fec36c6=1, fce7c69c-60c7-4006-8d2e-7363309c05d8=1, d7412fac-3f80-4c92-806d-cf10ae545a63=1, 7bafe6ad-d5fa-45d0-8c86-cef3d3d1bd3a=1, ce614913-d061-4e6b-ab6d-58df42fdab80=1, d1b9e739-b957-43e9
-80d2-c2a984c48639=1, af32cdf5-01f7-4196-aea6-e07ed36de5aa=1, 6e64c0d8-2bcb-428c-b64b-bd3391763d4e=1, 9b980efd-edee-462b-adad-c667e4b4ee65=1, 205dddb8-defe-4717-93a3-05953ceb406d=1, dabfb14f-5b83-469b-9798-ab89b440f379=1, b05053e8-dc96-42b8-99bd-a4918c950aed=
1, 2645af19-64fe-4b9d-bbba-a3d9202105be=1, 3bf0cb89-bdb4-47ff-81e7-ce67377d750b=1, 667bd106-cb20-429c-adec-07293f794db9=1, 4e7d90fb-b536-4ac0-83ca-036e151bf707=1, a0d5cf3b-7c1e-4ae6-a39c-b1c98a04ea8f=1, 625eccc2-d955-4bf9-a293-f53b945b0f09=1... and 28 more}],
 hasValBytes=true][op=UPDATE, val=], prevVal=[op=NOOP, val=null], oldVal=CacheObjectImpl [val=null, hasValBytes=true][op=UPDATE, val=], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntry
Predicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], val=CacheObjectImpl [val=null, hasValBytes=true], startVer=15454
44833545, ver=GridCacheVersion [topVer=155571373, order=1544175423890, nodeOrder=96], hash=1900127065, extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=null, rmts=LinkedList [GridCacheMvccCandidate [nodeId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, ver=Gr
idCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], threadId=1937, id=210410, topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], reentry=null, otherNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033, otherVer=null, mappedDhtNodes=null, mapp
edNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], masks=local=0|owner=0|ready=0|reentry=0|used=0|tx=1|single_implicit=0|dht_local=0|near_local=0|r
emoved=0|read=0, prevVer=null, nextVer=null]]]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=65, super=], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partU
pdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=5, txState=null, flags=last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], committedVers=null, rolledbackVers=null, cn
t=0, super=GridCacheIdMessage [cacheId=0]]]]]
java.lang.IllegalStateException: Unable to find consistentId by UUID [nodeId=51dc74ab-c989-4268-b850-ed69a24cca30, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0]]
        at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactId(ConsistentIdMapper.java:62)
        at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactIds(ConsistentIdMapper.java:123)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1142)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:993)
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.prepareRemoteTx(GridDistributedTxRemoteAdapter.java:407)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.startRemoteTx(IgniteTxHandler.java:1759)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxPrepareRequest(IgniteTxHandler.java:1121)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$400(IgniteTxHandler.java:101)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:205)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:203)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}
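
One possible reading of the two errors above: a node receives a message stamped with topVer=162 while its own discovery history only reaches 161, or its snapshot for that version does not contain the requested node. The sketch below is a minimal model of that lookup, not Ignite code. The class TopologyHistorySketch and its methods onDiscoveryEvent and consistentId are hypothetical names chosen for illustration, and the consistent id "node-1" is made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.UUID;

/** Hypothetical model of a per-node topology history; NOT an Ignite API. */
public class TopologyHistorySketch {
    /** Topology version -> (node id -> consistent id), as known to the local node. */
    private final NavigableMap<Long, Map<UUID, String>> history = new TreeMap<>();

    /** Records a snapshot once the local node has processed the discovery event. */
    public void onDiscoveryEvent(long topVer, Map<UUID, String> nodes) {
        history.put(topVer, new HashMap<>(nodes));
    }

    /** Resolves a consistent id for a node at a given topology version. */
    public String consistentId(long topVer, UUID nodeId) {
        Map<UUID, String> snapshot = history.get(topVer);

        // Failure mode resembling error (1): the version is not in local history yet.
        if (snapshot == null)
            throw new IllegalStateException("Failed to resolve nodes topology [topVer="
                + topVer + ", history=" + history.keySet() + "]");

        String id = snapshot.get(nodeId);

        // Failure mode resembling error (2): the version is known, the node is not.
        if (id == null)
            throw new IllegalStateException("Unable to find consistentId by UUID [nodeId="
                + nodeId + ", topVer=" + topVer + "]");

        return id;
    }

    public static void main(String[] args) {
        TopologyHistorySketch local = new TopologyHistorySketch();
        UUID node = UUID.fromString("51dc74ab-c989-4268-b850-ed69a24cca30");

        Map<UUID, String> nodes = new HashMap<>();
        nodes.put(node, "node-1"); // made-up consistent id
        local.onDiscoveryEvent(161L, nodes);

        try {
            // A message stamped with version 162 arrives before the local event is processed.
            local.consistentId(162L, node);
        }
        catch (IllegalStateException e) {
            System.out.println("Lookup failed: " + e.getMessage());
        }

        // Once the local node processes the event for version 162, the lookup succeeds.
        local.onDiscoveryEvent(162L, nodes);
        System.out.println("Resolved: " + local.consistentId(162L, node));
    }
}
```

In this model the first lookup fails only because of ordering: the request carries a version the local node has not recorded yet. Whether the real failure follows the same ordering is an assumption that the attached logs would need to confirm.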


Some details I noticed:
there are no diagnostic "Metrics for local node" messages in the logs of
Node(1), although the grid-timeout-worker thread is present in the thread dump:
Thread [name="grid-timeout-worker-#119%DPL_GRID%DplGridNodeName%", id=366, state=TIMED_WAITING, blockCnt=2, waitCnt=247178]
    Lock [object=java.lang.Object@682fdbd8, ownerName=null, ownerId=-1]
        at java.lang.Object.wait(Native Method)
        at o.a.i.i.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:258)
        at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:745)



> Multiple server node failure after a client node stopping
> ---------------------------------------------------------
>
>                 Key: IGNITE-10589
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10589
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Kosarev
>            Priority: Critical
>         Attachments: 16_02.tar
>
>
> After stopping a client node,
> we see the topology change and PME finish on the coordinator,
> but soon afterwards the other nodes still do not see the new topology and instead hit
> critical errors that cause node failures.
> Coordinator (crd) log:
> {code}
> 2018-12-06 15:55:23.660 [WARN ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Node FAILED: ZookeeperClusterNode [id=979f03db-f858-44f6-8646-12034dfd5c93, addrs=[10.116.206.1], order=129, loc=false, client=true]
> 2018-12-06 15:55:23.660 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Topology snapshot [ver=162, servers=128, clients=0, CPUs=7168, offheap=140000.0GB, heap=4000.0GB]
> 2018-12-06 15:55:23.660 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- Node [id=44D27930-80E5-4EB7-B377-8B07C02C2033, clusterState=ACTIVE]
> 2018-12-06 15:55:23.660 [INFO ][zk-DPL_GRID%DplGridNodeName-EventThread][o.a.i.s.d.z.i.ZookeeperDiscoveryImpl] Process alive nodes change [alives=128]
> 2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- Baseline [id=0, size=128, online=128, offline=0]
> 2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Data Regions Configured:
> 2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- dpl_mem_plc [initSize=256.0 MiB, maxSize=556.6 GiB, persistenceEnabled=true]
> 2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- not-persisted [initSize=256.0 MiB, maxSize=556.6 GiB, persistenceEnabled=false]
> 2018-12-06 15:55:23.670 [DEBUG][sys-#564%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.l.ExchangeLatchManager] Process node left 979f03db-f858-44f6-8646-12034dfd5c93
> 2018-12-06 15:55:23.670 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.exchange.time] Started exchange init [topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], crd=true, evt=NODE_FAILED, evtNode=979f03db-f858-44f6-8646-12034dfd5c93, customEvt=null, allowMerge=true]
> 2018-12-06 15:55:23.712 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.exchange.time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], crd=true]
> 2018-12-06 15:55:23.699 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], err=null]
> {code}
> On one node, Node(1), we have critical error (1):
> {code}
> 2018-12-06 15:55:23.727 [ERROR][utility-#432%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed processing message [senderId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, msg=GridDhtTxPrepareRequest [nearNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033,
>  futId=1d225238761-05eea259-5c25-4a4b-8469-9dd8980e218c, miniId=105, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], invalidateNearEntries={}, nearWrites=null, owned=null, nearXidVer=GridCacheVersion [topVer=155571374, order=1545423626166, nodeOrd
> er=1], subjId=44d27930-80e5-4eb7-b377-8b07c02c2033, taskNameHash=0, preloadKeys=null, skipCompletedVers=false, super=GridDistributedTxPrepareRequest [threadId=1281, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=15557137
> 4, order=1545423626614, nodeOrder=96], timeout=0, reads=null, writes=ArrayList [IgniteTxEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601, txKey=IgniteTxKey [key=KeyCa
> cheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601], val=CacheObjectImpl [val=GridServiceAssignments [nodeId=426a4a51-1af3-4019-9769-4a58d8ece426, topVer=162, cfg=LazyServiceConfigurat
> ion [srvcClsName=com.sbt.dpl.gridgain.thread.DPLThreadManager, svcCls=, nodeFilterCls=IgniteAllNodesPredicate], assigns=HashMap {74979bc7-e4c3-424a-8347-2f3a589aca3e=1, 32afc0a1-156d-4998-9b64-2336b86fb1c2=1, 6b34e59e-924c-404b-9451-ddf4b8b935b5=1, c2af8947-1
> a8d-48d5-93ca-165f97399519=1, 12f327e6-0f5f-44e2-be2c-f7bf99d22eec=1, 5638cb8c-4abb-49e3-8eef-edb3d5ccad77=1, 20349aca-ecf9-4a5f-bce9-2640e54fbbb2=1, e83f5de6-deaf-4f60-af5c-5a13d4f251a7=1, 2b4261be-feb5-4d59-be2a-6c9fcbe2fa4a=1, c3f0bff0-08fc-4601-b9f9-33192
> 9065c0a=1, bac0156c-a56b-4ab8-aa7c-7d9878151e9f=1, ccc2d442-8df4-402e-8589-d8ec3c6ec243=1, e34256a2-1bb9-4a17-86d1-21532833dded=1, 50bae4c5-16a5-48bf-a3f6-aa1b123074af=1, c5ecce59-cf6f-4be3-9861-e5c2622480a5=1, d95ad91e-abd6-4c59-bc02-6298278f84c5=1, 035787c0
> -4497-4682-9488-9be55e875175=1, e17bd18f-4a71-47f5-bdd6-64199a9bfb3a=1, e3565de2-04a3-4107-95d9-01cdd790838b=1, dd372a51-8239-4f8f-8eef-5d6f206e971e=1, c7ff660d-a003-493b-9aac-a6f73ad46561=1, 426a4a51-1af3-4019-9769-4a58d8ece426=1, 107810a2-c04d-452c-b4b5-d61
> abf16272c=1, 99df5f9f-5bc9-4f6b-a538-25c097124f38=1, 93baebe8-c8fc-4e25-8b6a-17f925e67dce=1, 44f063cf-9ad5-4095-96a3-54e554ab9ca3=1, d4f7a539-cc66-4d76-afd6-2d41f533c44e=1, 0be09f47-0a70-4589-a78c-6c9fb7393d43=1, 6094f19a-6754-4ce9-8892-a540a52cf775=1, 17ad38
> b2-1ec5-4531-9fca-397acbfa4a98=1, 23f220b1-cddf-4ac8-8987-05ec33569855=1, e7b59e5d-3102-4dda-9d84-95783d80940d=1, cd9fc4bb-b488-40f5-8a72-24ecf633e3b0=1, 76fd7993-7c53-4a59-b199-ece9ce6f1b32=1, 8aef872d-83cd-42f1-9641-612206f1d026=1, 158087d4-2ca2-4ee2-83be-7
> a61e52c9aac=1, 59b4bfeb-7690-4844-b3b9-d7939d72c098=1, 16bd8778-33a1-4a06-a904-8792f9991921=1, 5198403c-ffb1-4b64-b523-202cc76aee59=1, fd636af8-4dea-4df1-b6b1-099bd14ec8aa=1, aa2471f8-f5cd-445f-8b79-fa58e08783a7=1, 22b3ebec-8a3e-4320-8ccc-a0c968362222=1, b72f
> a1b9-7547-452c-99d2-5b61b6f8ebec=1, c56babe1-1071-4d8e-9499-3fc211d375a0=1, 43ff2a2f-db9c-4f5e-b721-ef350451ca0a=1, 20a6ff76-53f6-4508-8e4c-c87359e625a8=1, 07377a77-24da-4de3-8993-595b6f77e199=1, 0923e9e8-b412-40ae-bbb6-c75e1923cbd8=1, 416e7265-7507-4d05-b444
> -0118999146a1=1, cd12c8c4-d209-4e5b-b34e-9cf25523ee7d=1, 32552ec6-5a88-4ae6-a069-0d86594b7031=1, 89a4c1a0-33a1-40b2-a443-f15533bd13d7=1, b6545e2e-6ee5-4c0f-8fd8-3448bc2ab546=1, ab1dedb8-919b-4373-af45-250f12c7a8af=1, 944d0054-4b92-4eae-80dc-b6a45fe415c8=1, b7
> 147304-51c3-4c2d-9fb7-45853b27b79f=1, 6e334bfa-05d7-4909-94fb-e812d0fd4c76=1, a91104a0-e902-46e4-8725-47438b48b102=1, 5e5fb902-1c22-488a-bb25-87d10c8ccc0e=1, 297d27cc-4cd1-486f-a922-0a27808d8304=1, 5e069562-4ac7-48c6-8ce2-e6feb6be3d44=1, 7db65bfd-5478-4b3d-8d
> 2f-7f35696fe6fe=1, 66dbd7b0-c7b2-421e-bc11-edc3919e4a0c=1, c83a8121-38e8-4409-8cdb-d4929bf4d0cf=1, f9b0d868-0a8e-41a5-9c45-73ae688ffc1e=1, be79ee19-8cc4-467c-bb15-2b066c27d667=1, 9e54bf88-b439-4e2f-9f14-4dee1bb66a0d=1, 9eee3d1d-702e-4e02-916f-f21b4b5dc27a=1,
> e347eecb-3489-4e0d-a7dc-420854b8b3e9=1, 9cdf5210-6624-4f2f-9314-3e6bf3b23587=1, 1e17c56a-5213-4a1b-b94b-4575a95a2c81=1, 1988168d-839c-471b-9a83-3c48f0a7447f=1, 835ed2f5-6fd4-43e4-84bd-504f5df0e301=1, 8852b77b-17ee-47a7-af2f-4d63babe970e=1, d7b33e74-5683-40c4-
> 983d-6e2661183022=1, 16997ebc-723f-4b26-a14e-c98c40191646=1, 00dd20b3-12b4-46fe-800c-3ec405d11d98=1, b08008de-2ad0-4777-8365-2c7db6b470f3=1, 2bba7299-41f5-4635-8896-c6995425796e=1, 9d9234fe-3a77-4623-b83f-4297193ddf04=1, fbd450b0-b545-467c-acd6-2d4acf53f239=1
> , c36dc0f8-7e04-43c0-b1af-8883c42128f2=1, 8da84ab8-1d64-41c1-b904-03348fec36c6=1, fce7c69c-60c7-4006-8d2e-7363309c05d8=1, d7412fac-3f80-4c92-806d-cf10ae545a63=1, 7bafe6ad-d5fa-45d0-8c86-cef3d3d1bd3a=1, ce614913-d061-4e6b-ab6d-58df42fdab80=1, d1b9e739-b957-43e
> 9-80d2-c2a984c48639=1, af32cdf5-01f7-4196-aea6-e07ed36de5aa=1, 6e64c0d8-2bcb-428c-b64b-bd3391763d4e=1, 9b980efd-edee-462b-adad-c667e4b4ee65=1, 205dddb8-defe-4717-93a3-05953ceb406d=1, dabfb14f-5b83-469b-9798-ab89b440f379=1, b05053e8-dc96-42b8-99bd-a4918c950aed
> =1, 2645af19-64fe-4b9d-bbba-a3d9202105be=1, 3bf0cb89-bdb4-47ff-81e7-ce67377d750b=1, 667bd106-cb20-429c-adec-07293f794db9=1, 4e7d90fb-b536-4ac0-83ca-036e151bf707=1, a0d5cf3b-7c1e-4ae6-a39c-b1c98a04ea8f=1, 625eccc2-d955-4bf9-a293-f53b945b0f09=1... and 28 more}]
> , hasValBytes=true][op=UPDATE, val=], prevVal=[op=NOOP, val=null], oldVal=CacheObjectImpl [val=null, hasValBytes=true][op=UPDATE, val=], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntr
> yPredicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], val=CacheObjectImpl [val=null, hasValBytes=true], startVer=1545
> 426660336, ver=GridCacheVersion [topVer=155571373, order=1544175423890, nodeOrder=96], hash=1900127065, extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=null, rmts=LinkedList [GridCacheMvccCandidate [nodeId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, ver=G
> ridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], threadId=1942, id=219868, topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], reentry=null, otherNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033, otherVer=null, mappedDhtNodes=null, map
> pedNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], masks=local=0|owner=0|ready=0|reentry=0|used=0|tx=1|single_implicit=0|dht_local=0|near_local=0|
> removed=0|read=0, prevVer=null, nextVer=null]]]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=65, super=], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, part
> UpdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=5, txState=null, flags=last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], committedVers=null, rolledbackVers=null, c
> nt=0, super=GridCacheIdMessage [cacheId=0]]]]]
> org.apache.ignite.IgniteException: Failed to resolve nodes topology [cacheGrp=N/A, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], history=[AffinityTopologyVersion [topVer=161, minorTopVer=0]], snap=Snapshot [topVer=AffinityTopologyVersion [topVer
> =161, minorTopVer=0]], locNode=ZookeeperClusterNode [id=51dc74ab-c989-4268-b850-ed69a24cca30, addrs=[10.116.206.28], order=161, loc=true, client=false]]
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.resolveDiscoCache(GridDiscoveryManager.java:2111)
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.consistentId(GridDiscoveryManager.java:1950)
>         at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactIds(ConsistentIdMapper.java:104)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1142)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:993)
>         at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.prepareRemoteTx(GridDistributedTxRemoteAdapter.java:407)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.startRemoteTx(IgniteTxHandler.java:1759)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxPrepareRequest(IgniteTxHandler.java:1121)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$400(IgniteTxHandler.java:101)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:205)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:203)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
>         at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> On many other nodes we see critical error (2): they are unable to find *nodeId=51dc74ab-c989-4268-b850-ed69a24cca30*, which belongs to Node(1), and nearNodeId is the coordinator (CRD):
> {code}
> 2018-12-06 15:55:23.730 [ERROR][utility-#386%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed processing message [senderId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, msg=GridDhtTxPrepareRequest [nearNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033,
>  futId=1d225238761-05eea259-5c25-4a4b-8469-9dd8980e218c, miniId=79, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], invalidateNearEntries={}, nearWrites=null, owned=null, nearXidVer=GridCacheVersion [topVer=155571374, order=1545423626166, nodeOrde
> r=1], subjId=44d27930-80e5-4eb7-b377-8b07c02c2033, taskNameHash=0, preloadKeys=null, skipCompletedVers=false, super=GridDistributedTxPrepareRequest [threadId=1281, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=155571374
> , order=1545423626614, nodeOrder=96], timeout=0, reads=null, writes=ArrayList [IgniteTxEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601, txKey=IgniteTxKey [key=KeyCac
> heObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601], val=CacheObjectImpl [val=GridServiceAssignments [nodeId=426a4a51-1af3-4019-9769-4a58d8ece426, topVer=162, cfg=LazyServiceConfigurati
> on [srvcClsName=com.sbt.dpl.gridgain.thread.DPLThreadManager, svcCls=, nodeFilterCls=IgniteAllNodesPredicate], assigns=HashMap {74979bc7-e4c3-424a-8347-2f3a589aca3e=1, 32afc0a1-156d-4998-9b64-2336b86fb1c2=1, 6b34e59e-924c-404b-9451-ddf4b8b935b5=1, c2af8947-1a
> 8d-48d5-93ca-165f97399519=1, 12f327e6-0f5f-44e2-be2c-f7bf99d22eec=1, 5638cb8c-4abb-49e3-8eef-edb3d5ccad77=1, 20349aca-ecf9-4a5f-bce9-2640e54fbbb2=1, e83f5de6-deaf-4f60-af5c-5a13d4f251a7=1, 2b4261be-feb5-4d59-be2a-6c9fcbe2fa4a=1, c3f0bff0-08fc-4601-b9f9-331929
> 065c0a=1, bac0156c-a56b-4ab8-aa7c-7d9878151e9f=1, ccc2d442-8df4-402e-8589-d8ec3c6ec243=1, e34256a2-1bb9-4a17-86d1-21532833dded=1, 50bae4c5-16a5-48bf-a3f6-aa1b123074af=1, c5ecce59-cf6f-4be3-9861-e5c2622480a5=1, d95ad91e-abd6-4c59-bc02-6298278f84c5=1, 035787c0-
> 4497-4682-9488-9be55e875175=1, e17bd18f-4a71-47f5-bdd6-64199a9bfb3a=1, e3565de2-04a3-4107-95d9-01cdd790838b=1, dd372a51-8239-4f8f-8eef-5d6f206e971e=1, c7ff660d-a003-493b-9aac-a6f73ad46561=1, 426a4a51-1af3-4019-9769-4a58d8ece426=1, 107810a2-c04d-452c-b4b5-d61a
> bf16272c=1, 99df5f9f-5bc9-4f6b-a538-25c097124f38=1, 93baebe8-c8fc-4e25-8b6a-17f925e67dce=1, 44f063cf-9ad5-4095-96a3-54e554ab9ca3=1, d4f7a539-cc66-4d76-afd6-2d41f533c44e=1, 0be09f47-0a70-4589-a78c-6c9fb7393d43=1, 6094f19a-6754-4ce9-8892-a540a52cf775=1, 17ad38b
> 2-1ec5-4531-9fca-397acbfa4a98=1, 23f220b1-cddf-4ac8-8987-05ec33569855=1, e7b59e5d-3102-4dda-9d84-95783d80940d=1, cd9fc4bb-b488-40f5-8a72-24ecf633e3b0=1, 76fd7993-7c53-4a59-b199-ece9ce6f1b32=1, 8aef872d-83cd-42f1-9641-612206f1d026=1, 158087d4-2ca2-4ee2-83be-7a
> 61e52c9aac=1, 59b4bfeb-7690-4844-b3b9-d7939d72c098=1, 16bd8778-33a1-4a06-a904-8792f9991921=1, 5198403c-ffb1-4b64-b523-202cc76aee59=1, fd636af8-4dea-4df1-b6b1-099bd14ec8aa=1, aa2471f8-f5cd-445f-8b79-fa58e08783a7=1, 22b3ebec-8a3e-4320-8ccc-a0c968362222=1, b72fa
> 1b9-7547-452c-99d2-5b61b6f8ebec=1, c56babe1-1071-4d8e-9499-3fc211d375a0=1, 43ff2a2f-db9c-4f5e-b721-ef350451ca0a=1, 20a6ff76-53f6-4508-8e4c-c87359e625a8=1, 07377a77-24da-4de3-8993-595b6f77e199=1, 0923e9e8-b412-40ae-bbb6-c75e1923cbd8=1, 416e7265-7507-4d05-b444-
> 0118999146a1=1, cd12c8c4-d209-4e5b-b34e-9cf25523ee7d=1, 32552ec6-5a88-4ae6-a069-0d86594b7031=1, 89a4c1a0-33a1-40b2-a443-f15533bd13d7=1, b6545e2e-6ee5-4c0f-8fd8-3448bc2ab546=1, ab1dedb8-919b-4373-af45-250f12c7a8af=1, 944d0054-4b92-4eae-80dc-b6a45fe415c8=1, b71
> 47304-51c3-4c2d-9fb7-45853b27b79f=1, 6e334bfa-05d7-4909-94fb-e812d0fd4c76=1, a91104a0-e902-46e4-8725-47438b48b102=1, 5e5fb902-1c22-488a-bb25-87d10c8ccc0e=1, 297d27cc-4cd1-486f-a922-0a27808d8304=1, 5e069562-4ac7-48c6-8ce2-e6feb6be3d44=1, 7db65bfd-5478-4b3d-8d2
> f-7f35696fe6fe=1, 66dbd7b0-c7b2-421e-bc11-edc3919e4a0c=1, c83a8121-38e8-4409-8cdb-d4929bf4d0cf=1, f9b0d868-0a8e-41a5-9c45-73ae688ffc1e=1, be79ee19-8cc4-467c-bb15-2b066c27d667=1, 9e54bf88-b439-4e2f-9f14-4dee1bb66a0d=1, 9eee3d1d-702e-4e02-916f-f21b4b5dc27a=1, e
> 347eecb-3489-4e0d-a7dc-420854b8b3e9=1, 9cdf5210-6624-4f2f-9314-3e6bf3b23587=1, 1e17c56a-5213-4a1b-b94b-4575a95a2c81=1, 1988168d-839c-471b-9a83-3c48f0a7447f=1, 835ed2f5-6fd4-43e4-84bd-504f5df0e301=1, 8852b77b-17ee-47a7-af2f-4d63babe970e=1, d7b33e74-5683-40c4-9
> 83d-6e2661183022=1, 16997ebc-723f-4b26-a14e-c98c40191646=1, 00dd20b3-12b4-46fe-800c-3ec405d11d98=1, b08008de-2ad0-4777-8365-2c7db6b470f3=1, 2bba7299-41f5-4635-8896-c6995425796e=1, 9d9234fe-3a77-4623-b83f-4297193ddf04=1, fbd450b0-b545-467c-acd6-2d4acf53f239=1,
>  c36dc0f8-7e04-43c0-b1af-8883c42128f2=1, 8da84ab8-1d64-41c1-b904-03348fec36c6=1, fce7c69c-60c7-4006-8d2e-7363309c05d8=1, d7412fac-3f80-4c92-806d-cf10ae545a63=1, 7bafe6ad-d5fa-45d0-8c86-cef3d3d1bd3a=1, ce614913-d061-4e6b-ab6d-58df42fdab80=1, d1b9e739-b957-43e9
> -80d2-c2a984c48639=1, af32cdf5-01f7-4196-aea6-e07ed36de5aa=1, 6e64c0d8-2bcb-428c-b64b-bd3391763d4e=1, 9b980efd-edee-462b-adad-c667e4b4ee65=1, 205dddb8-defe-4717-93a3-05953ceb406d=1, dabfb14f-5b83-469b-9798-ab89b440f379=1, b05053e8-dc96-42b8-99bd-a4918c950aed=
> 1, 2645af19-64fe-4b9d-bbba-a3d9202105be=1, 3bf0cb89-bdb4-47ff-81e7-ce67377d750b=1, 667bd106-cb20-429c-adec-07293f794db9=1, 4e7d90fb-b536-4ac0-83ca-036e151bf707=1, a0d5cf3b-7c1e-4ae6-a39c-b1c98a04ea8f=1, 625eccc2-d955-4bf9-a293-f53b945b0f09=1... and 28 more}],
>  hasValBytes=true][op=UPDATE, val=], prevVal=[op=NOOP, val=null], oldVal=CacheObjectImpl [val=null, hasValBytes=true][op=UPDATE, val=], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntry
> Predicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], val=CacheObjectImpl [val=null, hasValBytes=true], startVer=15454
> 44833545, ver=GridCacheVersion [topVer=155571373, order=1544175423890, nodeOrder=96], hash=1900127065, extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=null, rmts=LinkedList [GridCacheMvccCandidate [nodeId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, ver=Gr
> idCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], threadId=1937, id=210410, topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], reentry=null, otherNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033, otherVer=null, mappedDhtNodes=null, mapp
> edNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], masks=local=0|owner=0|ready=0|reentry=0|used=0|tx=1|single_implicit=0|dht_local=0|near_local=0|r
> emoved=0|read=0, prevVer=null, nextVer=null]]]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=65, super=], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partU
> pdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=5, txState=null, flags=last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], committedVers=null, rolledbackVers=null, cn
> t=0, super=GridCacheIdMessage [cacheId=0]]]]]
> java.lang.IllegalStateException: Unable to find consistentId by UUID [nodeId=51dc74ab-c989-4268-b850-ed69a24cca30, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0]]
>         at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactId(ConsistentIdMapper.java:62)
>         at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactIds(ConsistentIdMapper.java:123)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1142)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:993)
>         at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.prepareRemoteTx(GridDistributedTxRemoteAdapter.java:407)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.startRemoteTx(IgniteTxHandler.java:1759)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxPrepareRequest(IgniteTxHandler.java:1121)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$400(IgniteTxHandler.java:101)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:205)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:203)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
>         at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> Some details I noticed:
> there are no diagnostic "Metrics for local node" messages in the logs of
> Node(1), although the grid-timeout-worker thread is present in the thread dump:
> Thread [name="grid-timeout-worker-#119%DPL_GRID%DplGridNodeName%", id=366, state=TIMED_WAITING, blockCnt=2, waitCnt=247178]
>     Lock [object=java.lang.Object@682fdbd8, ownerName=null, ownerId=-1]
>         at java.lang.Object.wait(Native Method)
>         at o.a.i.i.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:258)
>         at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
>         at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
