ignite-user mailing list archives

From Shaun Mcginnity <shaun.mcginn...@gmail.com>
Subject Cache atomic conditional remove returns error when node leaves cluster
Date Fri, 15 Apr 2016 11:11:16 GMT
Hi,

I'm attempting to create a distributed locking mechanism using a
distributed cache with the atomic putIfAbsent(key, value) and conditional
remove(key, value) operations.

The cache configuration is as follows:

CacheConfiguration [name=lock_data0, storeConcurrentLoadAllThreshold=5,
rebalancePoolSize=2, rebalanceTimeout=10000, evictPlc=FifoEvictionPolicy
[max=10, batchSize=1, maxMemSize=0, memSize=0], evictSync=false,
evictKeyBufSize=1024, evictSyncConcurrencyLvl=4, evictSyncTimeout=10000,
evictFilter=null, evictMaxOverflowRatio=10.0, eagerTtl=true,
dfltLockTimeout=0, startSize=1500000, nearCfg=null, writeSync=FULL_SYNC,
storeFactory=null, storeKeepBinary=false, loadPrevVal=false,
aff=org.apache.ignite.cache.affinity.fair.FairAffinityFunction@4e31276e,
cacheMode=PARTITIONED, atomicityMode=ATOMIC, atomicWriteOrderMode=null,
backups=1, invalidate=false, tmLookupClsName=null, rebalanceMode=ASYNC,
rebalanceOrder=0, rebalanceBatchSize=524288,
rebalanceBatchesPrefetchCount=2, offHeapMaxMem=0, swapEnabled=false,
maxConcurrentAsyncOps=500, writeBehindEnabled=false,
writeBehindFlushSize=10240, writeBehindFlushFreq=5000,
writeBehindFlushThreadCnt=1, writeBehindBatchSize=512,
memMode=OFFHEAP_TIERED, affMapper=null, rebalanceDelay=0,
rebalanceThrottle=0, interceptor=null, longQryWarnTimeout=3000,
readFromBackup=false, nodeFilter=null, sqlSchema=null, sqlEscapeAll=false,
sqlOnheapRowCacheSize=10240, snapshotableIdx=false, cpOnRead=true,
topValidator=null]

The code to put is:

while (!lockCache.putIfAbsent(entry, lockId)) {
    try {
        attempt++;
        // Back off briefly before retrying.
        Thread.sleep(2);
    } catch (InterruptedException e) {
        // Interrupted while backing off: log it and keep retrying.
        e.printStackTrace();
    }
}

I have wrapped the code to remove in a loop to monitor the error:

while (attempt < maxAttempts && !lockCache.remove(entry, lockId)) {
    // The conditional remove failed: read back what the cache holds for this key.
    String v = (String) lockCache.get(entry);
    System.err.println("ERROR : " + entry + " : lock invalid " + v + " expecting " + lockId);
    try {
        Thread.sleep(2);
    } catch (InterruptedException e) {
        // Interrupted while backing off: log it and keep retrying.
        e.printStackTrace();
    }
    attempt++;
}
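
For anyone who wants to try the pattern without a cluster, here is a minimal sketch of the two loops combined into acquire/release helpers. It assumes a java.util.concurrent.ConcurrentMap as a stand-in for the Ignite cache, so it shows only the intended single-JVM semantics and says nothing about behaviour during topology changes. The class and method names (LockSketch, acquire, release) are hypothetical and not part of Ignite.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, not part of Ignite: a ConcurrentMap stands in for the cache.
class LockSketch {
    private final ConcurrentMap<String, String> lockCache = new ConcurrentHashMap<>();

    /** Tries up to maxAttempts times to store lockId under key; true on success. */
    boolean acquire(String key, String lockId, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // ConcurrentMap.putIfAbsent returns the previous value (null when the
            // put succeeded); IgniteCache.putIfAbsent returns a boolean instead.
            if (lockCache.putIfAbsent(key, lockId) == null) {
                return true;
            }
            try {
                Thread.sleep(2);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and report that the lock was not taken.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    /** Removes the entry only if it still maps to lockId; true if it was removed. */
    boolean release(String key, String lockId) {
        return lockCache.remove(key, lockId);
    }

    public static void main(String[] args) {
        LockSketch locks = new LockSketch();
        System.out.println("owner-a acquires: " + locks.acquire("k1", "owner-a", 3));
        System.out.println("owner-b acquires: " + locks.acquire("k1", "owner-b", 2));
        System.out.println("owner-a releases: " + locks.release("k1", "owner-a"));
    }
}
```

In this single-JVM stand-in, a release with the same key and lockId that were used for a successful acquire always succeeds, which is the behaviour I expected from the cache.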

I have 4 nodes in the cluster. If one node shuts down (e.g. after Ctrl-C),
I get a small number of errors when trying to conditionally remove a key,
just at the point when the other nodes detect that the node has left:

INFO: Node left topology: TcpDiscoveryNode
[id=fcea159d-2183-45b1-a985-1fb1ffa8b552, addrs=[0:0:0:0:0:0:0:1%lo,
10.20.50.160, 10.20.75.17, 127.0.0.1,
2a00:2381:757:50:2e76:8aff:fe57:3714%eth2,
2a00:2381:757:75:ea39:35ff:fec4:6d98%eth0],
sockAddrs=[/2a00:2381:757:50:2e76:8aff:fe57:3714%eth2:47503,
/0:0:0:0:0:0:0:1%lo:47503,
bfs-dl380pg8-03.bfs.openwave.com/10.20.50.160:47503, /10.20.50.160:47503,
/2a00:2381:757:75:ea39:35ff:fec4:6d98%eth0:47503, /10.20.75.17:47503,
bfs-dl380pg8-03t.bfs.openwave.com/10.20.75.17:47503, /127.0.0.1:47503,
/2a00:2381:757:50:2e76:8aff:fe57:3714%eth2:47503,
/2a00:2381:757:75:ea39:35ff:fec4:6d98%eth0:47503], discPort=47503, order=4,
intOrder=4, lastExchangeTime=1460716761999, loc=false,
ver=1.5.0#20151229-sha1:f1f8cda2, isClient=false]
Apr 15, 2016 11:41:17 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Topology snapshot [ver=5, servers=3, clients=0, CPUs=32, heap=6.0GB]
ERROR : 0_k0001162115 : lock invalid null expecting 428696898473974
ERROR : 0_k0001132312 : lock invalid null expecting 428696898473973

So remove(entry, lockId) returns false even though the putIfAbsent for the
same key and lockId was successful, and the follow-up get returns null.
I don't see any exception being thrown by remove.
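
To make the failure cases easier to tell apart while monitoring, the check after a failed remove could be sketched as below. This is only an illustration: it assumes a java.util.concurrent.ConcurrentMap in place of the Ignite cache, the names ReleaseCheck and Outcome are hypothetical, and it does not explain why the entry is missing. It only separates "entry already gone" (the case in the log above) from "entry held under a different lockId".

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical diagnostic helper; a ConcurrentMap stands in for the cache.
class ReleaseCheck {
    enum Outcome { RELEASED, ALREADY_ABSENT, HELD_BY_OTHER }

    static Outcome release(ConcurrentMap<String, String> cache, String key, String lockId) {
        // Conditional remove: succeeds only if key currently maps to lockId.
        if (cache.remove(key, lockId)) {
            return Outcome.RELEASED;
        }
        // The remove and this read are two separate operations, so the value
        // may have changed in between; the result is a best-effort diagnosis.
        String current = cache.get(key);
        return current == null ? Outcome.ALREADY_ABSENT : Outcome.HELD_BY_OTHER;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        cache.put("k1", "owner-a");
        System.out.println(release(cache, "k1", "owner-b"));
        System.out.println(release(cache, "k1", "owner-a"));
        System.out.println(release(cache, "k1", "owner-a"));
    }
}
```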

Is there an explanation for this, and any workaround?

Regards,

Shaun
