ignite-issues mailing list archives

From "Aleksey Plekhanov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (IGNITE-8443) Flaky failure of IgniteCacheClientNodeChangingTopologyTest.testPessimisticTxPutAllMultinode
Date Mon, 07 May 2018 09:02:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-8443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465642#comment-16465642 ]

Aleksey Plekhanov edited comment on IGNITE-8443 at 5/7/18 9:01 AM:
-------------------------------------------------------------------

The main reason for this behavior: the transaction hangs when an error occurs during processing
of {{GridNearLockRequest}}. In {{testPessimisticTxPutAllMultinode}}, the minor topology version
changes after rebalancing and the partition for the primary key moves to the {{RENTING}} state.
When we try to update data in this partition, an exception is thrown, but no error response is
sent to the node that initiated the transaction.

Another simple reproducer for this case:

{code:java}
    @Override protected IgniteConfiguration getConfiguration(final String igniteInstanceName) throws Exception {
        return super.getConfiguration(igniteInstanceName)
            .setCacheConfiguration(
                new CacheConfiguration()
                    .setName(DEFAULT_CACHE_NAME)
                    .setCacheMode(CacheMode.PARTITIONED)
                    .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL)
            )
            .setEventStorageSpi(
                new NoopEventStorageSpi() {
                    @Override public void record(Event evt) throws IgniteSpiException {
                        // Throw on the node with index 1 whenever a cache entry is created.
                        if (evt.type() == EVT_CACHE_ENTRY_CREATED && getTestIgniteInstanceIndex(igniteInstanceName) == 1)
                            throw new CacheException();
                    }
                }
            );
    }

    public void testTxFailure() throws Exception {
        startGrids(2);

        IgniteCache cache0 = grid(0).cache(DEFAULT_CACHE_NAME);
        IgniteCache cache1 = grid(1).cache(DEFAULT_CACHE_NAME);

        grid(0).transactions().txStart(TransactionConcurrency.PESSIMISTIC, TransactionIsolation.REPEATABLE_READ);

        // The key is primary on node 1, so the lock request from node 0 is processed there.
        cache0.put(primaryKey(cache1), 0);
    }
{code}
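
The hang described above can also be illustrated outside Ignite. The following is a minimal sketch in plain Java, not Ignite code: the class {{LockRequestSketch}}, its methods and the 200 ms wait are hypothetical names and values chosen for illustration. It models a request handler that fails while processing a request. In the first variant the error is only logged and the caller's future is never completed, which corresponds to the missing error response. In the second variant the error is propagated to the caller.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model of a request/response exchange (hypothetical, not Ignite code). */
public class LockRequestSketch {
    /** Stand-in for request processing that always fails. */
    static void process(String key) {
        throw new IllegalStateException("Partition is being evicted, key=" + key);
    }

    /** Buggy variant: the error is logged, but the future is never completed. */
    static CompletableFuture<Boolean> handleWithoutErrorResponse(String key) {
        CompletableFuture<Boolean> res = new CompletableFuture<>();

        try {
            process(key);

            res.complete(true);
        }
        catch (RuntimeException e) {
            // No response is sent: the initiator keeps waiting.
            System.err.println("Failed to process request: " + e.getMessage());
        }

        return res;
    }

    /** Variant that reports the error back to the initiator. */
    static CompletableFuture<Boolean> handleWithErrorResponse(String key) {
        CompletableFuture<Boolean> res = new CompletableFuture<>();

        try {
            process(key);

            res.complete(true);
        }
        catch (RuntimeException e) {
            res.completeExceptionally(e);
        }

        return res;
    }

    /** Waits briefly for the response and reports what the initiator observes. */
    static String await(CompletableFuture<Boolean> fut) {
        try {
            fut.get(200, TimeUnit.MILLISECONDS);

            return "OK";
        }
        catch (TimeoutException e) {
            return "HANGS";
        }
        catch (ExecutionException e) {
            return "FAILED";
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            return "INTERRUPTED";
        }
    }

    public static void main(String[] args) {
        System.out.println(await(handleWithoutErrorResponse("k"))); // prints "HANGS"
        System.out.println(await(handleWithErrorResponse("k")));    // prints "FAILED"
    }
}
```

In this model the initiator can only tell the two cases apart because of the timeout in {{await}}. With no timeout, as in the reproducer above, the first variant blocks indefinitely.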



was (Author: alex_pl):
Main reason of this behavior: transaction hangs when some error occurs during processing of
{{GridNearLockRequest}} (In {{testPessimisticTxPutAllMultinode}} after rebalancing changing
minor topology version and partition for primary key change state to {{RENTING}} and exception
is thrown when we try to update data in this partition). Response with error is not sending
to transaction initiating node.


> Flaky failure of IgniteCacheClientNodeChangingTopologyTest.testPessimisticTxPutAllMultinode
> -------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8443
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8443
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Aleksey Plekhanov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
>
> The test sometimes fails on TC (failure rate: 30%) with the following error:
> {noformat}
> junit.framework.AssertionFailedError: Failed to wait for update.
>     at org.apache.ignite.internal.processors.cache.distributed.IgniteCacheClientNodeChangingTopologyTest.multinode(IgniteCacheClientNodeChangingTopologyTest.java:1855)
>     at org.apache.ignite.internal.processors.cache.distributed.IgniteCacheClientNodeChangingTopologyTest.testPessimisticTxPutAllMultinode(IgniteCacheClientNodeChangingTopologyTest.java:1673)
> {noformat}
> Each time, a few seconds before the failure, there is an error in the log:
> {noformat}
> [ERROR][sys-stripe-10-#90529%distributed.IgniteCacheClientNodeChangingTopologyTest0%][GridDhtColocatedCache]
> <default> Failed to unmarshal at least one of the keys for lock request message: GridNearLockRequest
> [topVer=AffinityTopologyVersion [topVer=10, minorTopVer=0], miniId=1, dhtVers=[...], subjId=5ad87047-5d80-4530-bb48-f7c268400006,
> taskNameHash=0, createTtl=-1, accessTtl=-1, flags=6, filter=null, super=GridDistributedLockRequest
> [nodeId=5ad87047-5d80-4530-bb48-f7c268400006, nearXidVer=GridCacheVersion [topVer=136730132,
> order=1525250131532, nodeOrder=7], threadId=100107, futId=3e2912f2361-94bff164-8062-4fb4-8d85-c2e89e579148,
> timeout=0, isInTx=true, isInvalidate=false, isRead=false, isolation=REPEATABLE_READ, retVals=[...],
> txSize=0, flags=0, keysCnt=94, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=136730132,
> order=1525250131532, nodeOrder=7], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage
> [cacheId=1544803905]]]]
>  class org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtInvalidPartitionException
> [part=54, msg=Adding entry to partition that is concurrently evicted [grp=default, part=54,
> shouldBeMoving=, belongs=true, topVer=AffinityTopologyVersion [topVer=10, minorTopVer=0],
> curTopVer=AffinityTopologyVersion [topVer=10, minorTopVer=1]]]
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.localPartition0(GridDhtPartitionTopologyImpl.java:923)
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.localPartition(GridDhtPartitionTopologyImpl.java:798)
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.localPartition(GridCachePartitionedConcurrentMap.java:69)
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.putEntryIfObsoleteOrAbsent(GridCachePartitionedConcurrentMap.java:88)
>  	at org.apache.ignite.internal.processors.cache.GridCacheAdapter.entryEx(GridCacheAdapter.java:955)
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.entryEx(GridDhtCacheAdapter.java:525)
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.entryExx(GridDhtCacheAdapter.java:545)
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.lockAllAsync(GridDhtTransactionalCacheAdapter.java:987)
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.processNearLockRequest0(GridDhtTransactionalCacheAdapter.java:667)
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.access$800(GridDhtTransactionalCacheAdapter.java:94)
>  	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$12$1.run(GridDhtTransactionalCacheAdapter.java:704)
>  	at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:511)
>  	at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
