ignite-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From smurphy <smur...@trustwave.com>
Subject Grid freezing
Date Thu, 19 Oct 2017 21:24:56 GMT
threaddump.tdump
<http://apache-ignite-users.70518.x6.nabble.com/file/t1317/threaddump.tdump>  

I am using Ignite v2.1 and my code using Optimistic/Serializable
transactions and is locking up. When it does, there are a lot of `WARNING:
Found long running transaction` and `WARNING: Found long running cache
future` messages in the logs (see below). Eventually the grid freezes and
there are a lot of "IgniteInterruptedException: Got interrupted while
waiting for future to complete." (see below). Also, see attached thread dump
when the grid is finally stopped.

Using visor, I could see the active threads increase until all are consumed.
I boosted the thread pool count to 100 and saw the active threads rise
steadily until all were consumed. At theat point, the grid freezes up.

Using visor ping multiple times, the servers always ping successfully but
the client fails intermittently. 

LONG RUNNING TRANSACTION:

2017-10-19 13:03:06,698 WARN  [ ] grid-timeout-worker-#15%null%  
{Log4JLogger.java:480} - Found long running transaction
[startTime=12:30:33.985, curTime=13:03:06.682, tx=GridNearTxLocal
[mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
[entries=[IgniteTxEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], cacheId=1481046058, txKey=IgniteTxKey
[key=KeyCacheObjectImpl [part=126, val=GridCacheInternalKeyImpl
[name=SCAN_IDX, grpName=default-ds-group], hasValBytes=true],
cacheId=1481046058], val=[op=TRANSFORM, val=null], prevVal=[op=NOOP,
val=null], oldVal=[op=NOOP, val=null], entryProcessorsCol=[IgniteBiTuple
[val1=GetAndIncrementProcessor [], val2=[Ljava.lang.Object;@49d9dc6c]],
ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null,
dhtVer=null, filters=[], filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], val=null, startVer=1508434155232, ver=GridCacheVersion
[topVer=119473256, order=1508434155232, nodeOrder=2243], hash=-375255214,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=36b4d422-d011-4f77-919a-b8ffb089614b, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=119473256,
order=1508434155230, nodeOrder=2243]]], explicitLock=false, dhtVer=null,
last=false, nearEntries=0, clientFirst=false,
node=36b4d422-d011-4f77-919a-b8ffb089614b]], nearLocallyMapped=false,
colocatedLocallyMapped=false, needCheckBackup=null, hasRemoteLocks=false,
thread=<failed to find active thread 109>,
mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
[entries=[IgniteTxEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], cacheId=1481046058, txKey=IgniteTxKey
[key=KeyCacheObjectImpl [part=126, val=GridCacheInternalKeyImpl
[name=SCAN_IDX, grpName=default-ds-group], hasValBytes=true],
cacheId=1481046058], val=[op=TRANSFORM, val=null], prevVal=[op=NOOP,
val=null], oldVal=[op=NOOP, val=null], entryProcessorsCol=[IgniteBiTuple
[val1=GetAndIncrementProcessor [], val2=[Ljava.lang.Object;@49d9dc6c]],
ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null,
dhtVer=null, filters=[], filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], val=null, startVer=1508434155232, ver=GridCacheVersion
[topVer=119473256, order=1508434155232, nodeOrder=2243], hash=-375255214,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=36b4d422-d011-4f77-919a-b8ffb089614b, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=119473256,
order=1508434155230, nodeOrder=2243]]], explicitLock=false, dhtVer=null,
last=false, nearEntries=0, clientFirst=false,
node=36b4d422-d011-4f77-919a-b8ffb089614b]], super=GridDhtTxLocalAdapter
[nearOnOriginatingNode=false, nearNodes=[], dhtNodes=[], explicitLock=false,
super=IgniteTxLocalAdapter [completedBase=null, sndTransformedVals=false,
depEnabled=false, txState=IgniteTxImplicitSingleStateImpl [init=true,
recovery=false], super=IgniteTxAdapter [xidVer=GridCacheVersion
[topVer=119473256, order=1508434155230, nodeOrder=2243], writeVer=null,
implicit=true, loc=true, threadId=109, startTime=1508434233985,
nodeId=cb0332bd-1c18-4695-a75f-0d0ed4637b55, startVer=GridCacheVersion
[topVer=119473256, order=1508434155230, nodeOrder=2243], endVer=null,
isolation=READ_COMMITTED, concurrency=OPTIMISTIC, timeout=0,
sysInvalidate=false, sys=true, plc=2, commitVer=GridCacheVersion
[topVer=119473256, order=1508434155230, nodeOrder=2243], finalizing=NONE,
invalidParts=null, state=PREPARING, timedOut=false,
topVer=AffinityTopologyVersion [topVer=2243, minorTopVer=0],
duration=1952707ms, onePhaseCommit=true], size=1]]]]

LONG RUNNING CACHE FUTURE:
2017-10-19 13:03:06,709 WARN  [ ] grid-timeout-worker-#15%null%  
{Log4JLogger.java:480} - Found long running cache future
[startTime=12:30:33.985, curTime=13:03:06.682, fut=GridNearTxFinishFuture
[futId=d7d7ea53f51-2939d9ae-16bb-41f2-a756-32150fde4ecf, tx=GridNearTxLocal
[mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
[entries=[IgniteTxEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], cacheId=1481046058, txKey=IgniteTxKey
[key=KeyCacheObjectImpl [part=126, val=GridCacheInternalKeyImpl
[name=SCAN_IDX, grpName=default-ds-group], hasValBytes=true],
cacheId=1481046058], val=[op=TRANSFORM, val=null], prevVal=[op=NOOP,
val=null], oldVal=[op=NOOP, val=null], entryProcessorsCol=[IgniteBiTuple
[val1=GetAndIncrementProcessor [], val2=[Ljava.lang.Object;@49d9dc6c]],
ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null,
dhtVer=null, filters=[], filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], val=null, startVer=1508434155232, ver=GridCacheVersion
[topVer=119473256, order=1508434155232, nodeOrder=2243], hash=-375255214,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=36b4d422-d011-4f77-919a-b8ffb089614b, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=119473256,
order=1508434155230, nodeOrder=2243]]], explicitLock=false, dhtVer=null,
last=false, nearEntries=0, clientFirst=false,
node=36b4d422-d011-4f77-919a-b8ffb089614b]], nearLocallyMapped=false,
colocatedLocallyMapped=false, needCheckBackup=null, hasRemoteLocks=false,
thread=<failed to find active thread 109>,
mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
[entries=[IgniteTxEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], cacheId=1481046058, txKey=IgniteTxKey
[key=KeyCacheObjectImpl [part=126, val=GridCacheInternalKeyImpl
[name=SCAN_IDX, grpName=default-ds-group], hasValBytes=true],
cacheId=1481046058], val=[op=TRANSFORM, val=null], prevVal=[op=NOOP,
val=null], oldVal=[op=NOOP, val=null], entryProcessorsCol=[IgniteBiTuple
[val1=GetAndIncrementProcessor [], val2=[Ljava.lang.Object;@49d9dc6c]],
ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null,
dhtVer=null, filters=[], filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], val=null, startVer=1508434155232, ver=GridCacheVersion
[topVer=119473256, order=1508434155232, nodeOrder=2243], hash=-375255214,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=36b4d422-d011-4f77-919a-b8ffb089614b, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=119473256,
order=1508434155230, nodeOrder=2243]]], explicitLock=false, dhtVer=null,
last=false, nearEntries=0, clientFirst=false,
node=36b4d422-d011-4f77-919a-b8ffb089614b]], super=GridDhtTxLocalAdapter
[nearOnOriginatingNode=false, nearNodes=[], dhtNodes=[], explicitLock=false,
super=IgniteTxLocalAdapter [completedBase=null, sndTransformedVals=false,
depEnabled=false, txState=IgniteTxImplicitSingleStateImpl [init=true,
recovery=false], super=IgniteTxAdapter [xidVer=GridCacheVersion
[topVer=119473256, order=1508434155230, nodeOrder=2243], writeVer=null,
implicit=true, loc=true, threadId=109, startTime=1508434233985,
nodeId=cb0332bd-1c18-4695-a75f-0d0ed4637b55, startVer=GridCacheVersion
[topVer=119473256, order=1508434155230, nodeOrder=2243], endVer=null,
isolation=READ_COMMITTED, concurrency=OPTIMISTIC, timeout=0,
sysInvalidate=false, sys=true, plc=2, commitVer=GridCacheVersion
[topVer=119473256, order=1508434155230, nodeOrder=2243], finalizing=NONE,
invalidParts=null, state=PREPARING, timedOut=false,
topVer=AffinityTopologyVersion [topVer=2243, minorTopVer=0],
duration=1952717ms, onePhaseCommit=true], size=1]]], commit=true,
mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
[entries=[IgniteTxEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], cacheId=1481046058, txKey=IgniteTxKey
[key=KeyCacheObjectImpl [part=126, val=GridCacheInternalKeyImpl
[name=SCAN_IDX, grpName=default-ds-group], hasValBytes=true],
cacheId=1481046058], val=[op=TRANSFORM, val=null], prevVal=[op=NOOP,
val=null], oldVal=[op=NOOP, val=null], entryProcessorsCol=[IgniteBiTuple
[val1=GetAndIncrementProcessor [], val2=[Ljava.lang.Object;@49d9dc6c]],
ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null,
dhtVer=null, filters=[], filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObjectImpl [part=126,
val=GridCacheInternalKeyImpl [name=SCAN_IDX, grpName=default-ds-group],
hasValBytes=true], val=null, startVer=1508434155232, ver=GridCacheVersion
[topVer=119473256, order=1508434155232, nodeOrder=2243], hash=-375255214,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=36b4d422-d011-4f77-919a-b8ffb089614b, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=119473256,
order=1508434155230, nodeOrder=2243]]], explicitLock=false, dhtVer=null,
last=false, nearEntries=0, clientFirst=false,
node=36b4d422-d011-4f77-919a-b8ffb089614b]], trackable=true,
finishOnePhaseCalled=false, innerFuts=[], super=GridCompoundIdentityFuture
[super=GridCompoundFuture [rdc=AlwaysTrueReducer [], initFlag=0,
lsnrCalls=0, done=false, cancelled=false, err=null, futs=[]]]]]

IGNITECHECKEDEXCEPTION:
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: SEVERE: Failed to execute
job [jobId=ac08ea53f51-cb0332bd-1c18-4695-a75f-0d0ed4637b55,
ses=GridJobSessionImpl [ses=GridTaskSessionImpl
[taskName=com.company.fragment.node.callable.DequeuePortionsCallable,
dep=GridDeployment [ts=1508433959490, depMode=SHARED,
clsLdr=sun.misc.Launcher$AppClassLoader@764c12b6,
clsLdrId=fa68ba53f51-85f1f563-695c-43f9-a263-e9f67e8925b7, userVer=0,
loc=true,
sampleClsName=o.a.i.i.processors.task.GridTaskProcessor$TaskDiscoveryListener$1,
pendingUndeploy=false, undeployed=false, usage=64],
taskClsName=com.company.fragment.node.callable.DequeuePortionsCallable,
sesId=9c08ea53f51-cb0332bd-1c18-4695-a75f-0d0ed4637b55,
startTime=1508434417976, endTime=9223372036854775807,
taskNodeId=cb0332bd-1c18-4695-a75f-0d0ed4637b55,
clsLdr=sun.misc.Launcher$AppClassLoader@764c12b6, closed=false, cpSpi=null,
failSpi=null, loadSpi=null, usage=1, fullSup=false, internal=false,
subjId=cb0332bd-1c18-4695-a75f-0d0ed4637b55, mapFut=IgniteFuture
[orig=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null,
hash=2051666823]], execName=null],
jobId=ac08ea53f51-cb0332bd-1c18-4695-a75f-0d0ed4637b55]]
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: class
org.apache.ignite.IgniteException: class
org.apache.ignite.IgniteInterruptedException: Got interrupted while waiting
for future to complete.
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1850)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:566)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6608)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:560)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:489)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1181)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1908)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
java.lang.Thread.run(Unknown Source)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: Caused by:
javax.cache.CacheException: class
org.apache.ignite.IgniteInterruptedException: Got interrupted while waiting
for future to complete.
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1312)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.cache.IgniteCacheProxy.cacheException(IgniteCacheProxy.java:2630)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.cache.IgniteCacheProxy.get(IgniteCacheProxy.java:1188)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
com.company.fragment.node.ignite.IgnitePortionDequeuer.dequeuePortionIds(IgnitePortionDequeuer.java:276)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
com.company.fragment.node.ignite.IgnitePortionDequeuer._dequeuePortionIds(IgnitePortionDequeuer.java:196)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
com.company.fragment.node.ignite.IgnitePortionDequeuer.dequeuePortions(IgnitePortionDequeuer.java:146)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
com.company.fragment.node.callable.DequeuePortionsCallable.call(DequeuePortionsCallable.java:44)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
com.company.fragment.node.callable.DequeuePortionsCallable.call(DequeuePortionsCallable.java:35)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1847)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: ... 14 more
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: Caused by: class
org.apache.ignite.IgniteInterruptedException: Got interrupted while waiting
for future to complete.
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:796)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: at
org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:794)
Oct 19 13:29:22 prod-srvr apache-ignite[46822]: ... 23 more





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Mime
View raw message