ignite-issues mailing list archives

From "Alexey Goncharuk (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-6939) Exclude false owners from the execution plan based on query response
Date Thu, 16 Nov 2017 15:47:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Goncharuk updated IGNITE-6939:
-------------------------------------
    Description: 
This is related to IGNITE-6858; the fix in that ticket can be improved.

The scenario leading to the issue is as follows:
1) Node A has partition 1 in the OWNING state
2) Node B has a local partition map in which partition 1 on node A is OWNING
3) A topology change is triggered that will move partition 1 from A to another node; the topology
version is X
4) A transaction is started on node B on topology X
5) The partition is rebalanced, and node A moves partition 1 to the RENTING and then to the EVICTED
state; node A updates its local partition map
6) A new topology change is triggered
7) Node A sends its partition map (transitively) to node B, but since there is a pending exchange,
node B ignores the updated map and still thinks that A owns partition 1 [1]
8) The transaction attempts to execute an SQL query against partition 1 on node A and retries
indefinitely
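
To illustrate why step 8 never terminates, here is a minimal sketch of the retry loop. It is not Ignite code: the class, the method and the attempt cap are hypothetical and exist only so the example finishes. The point is that node B's view of partition 1 stays fixed while the exchange is pending, so every reservation attempt on node A fails in the same way.

```java
// Hypothetical model of step 8; none of these names exist in Ignite.
final class StaleOwnerRetrySketch {
    enum State { OWNING, RENTING, EVICTED }

    /**
     * Counts failed reservation attempts for one partition.
     * viewOnB is what node B's local map says about the partition on node A,
     * actualOnA is the state node A really has. The cap is an assumption of
     * this sketch; in the reported scenario the retries are unbounded.
     */
    static int failedAttempts(State viewOnB, State actualOnA, int maxAttempts) {
        int failures = 0;
        while (failures < maxAttempts) {
            if (viewOnB != State.OWNING)
                break; // B would not map the query to A at all
            if (actualOnA == State.OWNING)
                break; // reservation on A succeeds
            failures++; // reservation fails, and B's view does not change
        }
        return failures;
    }
}
```

With a stale view (B sees OWNING, A is EVICTED) every allowed attempt fails; with a correct view no attempt fails.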

[1] The related code is in GridDhtPartitionTopologyImpl#update(AffinityTopologyVersion, GridDhtPartitionFullMap,
CachePartitionFullCountersMap, Set, AffinityTopologyVersion)
{code}
if (stopping || !lastTopChangeVer.initialized() ||
    // Ignore message not-related to exchange if exchange is in progress.
    (exchangeVer == null && !lastTopChangeVer.equals(readyTopVer)))
    return false;
{code}
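
The effect of this condition can be shown in isolation. The sketch below is a simplified stand-in, not the actual Ignite method: topology versions are modelled as Long values (null meaning "not initialized" or "not part of an exchange"), and the class and method names are hypothetical. It returns true when the incoming map would be ignored, which corresponds to the `return false` branch above.

```java
import java.util.Objects;

// Hypothetical stand-in for the guard; Long replaces AffinityTopologyVersion.
final class PartitionMapGuardSketch {
    /**
     * Returns true when an incoming full partition map would be ignored.
     * exchangeVer == null models a message that is not related to an exchange.
     */
    static boolean ignoresUpdate(boolean stopping, Long lastTopChangeVer,
                                 Long readyTopVer, Long exchangeVer) {
        boolean initialized = lastTopChangeVer != null; // models initialized()
        if (stopping || !initialized)
            return true;
        // An exchange is pending while the last topology change is not yet ready.
        boolean exchangePending = !Objects.equals(lastTopChangeVer, readyTopVer);
        return exchangeVer == null && exchangePending;
    }
}
```

In step 7 of the scenario the map from node A arrives outside an exchange while a newer topology change is pending, so `ignoresUpdate(false, 6L, 5L, null)` is true and node B keeps its stale view. The same map would be accepted once both versions match.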

There are two ways to fix this:
1) Make all updates to the partition map in a single thread; then we will not need update sequences
and can update the local partition map even when there is a pending exchange (this is
a relatively big, but useful change)
2) Change SQL query execution so that if a node cannot reserve a partition, the partition is not
mapped to this node on the same topology version anymore (a quick fix)

This will remove the need to throw an exception from an SQL query inside a transaction when there
is a pending exchange.
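
A rough sketch of how option 2 could look follows. It is only an illustration under assumptions: the class, its methods, the String node IDs and the Long topology versions are all hypothetical and are not part of the Ignite query engine. The idea is to remember, per topology version, which nodes failed to reserve a partition and to skip them when the query is mapped again on that version.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper; not an Ignite class.
final class FalseOwnerFilterSketch {
    // topology version -> partition -> nodes that failed to reserve it
    private final Map<Long, Map<Integer, Set<String>>> falseOwners = new HashMap<>();

    /** Records that nodeId could not reserve the partition on topVer. */
    void onReservationFailed(long topVer, int part, String nodeId) {
        falseOwners.computeIfAbsent(topVer, v -> new HashMap<>())
            .computeIfAbsent(part, p -> new HashSet<>())
            .add(nodeId);
    }

    /** Picks the first owner from the local map that is not excluded on topVer. */
    Optional<String> pickOwner(long topVer, int part, List<String> ownersFromLocalMap) {
        Set<String> excluded = falseOwners
            .getOrDefault(topVer, Collections.emptyMap())
            .getOrDefault(part, Collections.emptySet());
        for (String node : ownersFromLocalMap) {
            if (!excluded.contains(node))
                return Optional.of(node);
        }
        return Optional.empty(); // no usable owner is known on this version
    }

    /** Drops exclusions recorded for versions older than the ready one. */
    void onTopologyReady(long readyTopVer) {
        falseOwners.keySet().removeIf(v -> v < readyTopVer);
    }
}
```

The exclusion is scoped to a single topology version on purpose: a node that is a false owner on version X may legitimately own the partition again later, so the records are not carried over.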

  was:
This is related to IGNITE-6858, the fix in the ticket can be improved.

The scenario leading to the issue is as follows:
1) Node A has partition 1 as owning
2) Node B has local partition map which has partition 1 on node A as owning
3) Topology change is triggered which would move partition 1 from A to another node, topology
version is X
4) A transaction is started on node B on topology X
5) Partition is rebalanced and node A moves partition 1 to RENTING and then to EVICTED state,
node A updates its local partition map.
6) A new topology change is triggered
7) Node A sends partition map (transitively) to the node B, but since there is a pending exchange,
node B ignores the updated map and still thinks that A owns partition 1 [1]
8) transaction attempts to execute an SQL query against partition 1 on node A and retries
infinitely

[1] The related code is in GridDhtPartitionTopologyImpl#update(AffinityTopologyVersion, GridDhtPartitionFullMap,
CachePartitionFullCountersMap, Set, AffinityTopologyVersion)
{code}
if (stopping || !lastTopChangeVer.initialized() ||
    // Ignore message not-related to exchange if exchange is in progress.
    (exchangeVer == null && !lastTopChangeVer.equals(readyTopVer)))
    return false;
{code}

There are two possibilities to fix this:
1) Make all updates to partition map in a single thread, then we will not need update sequences
and then we can update local partition map even when there is a pending exchange (this is
a relatively big, but useful change)
2) Make a change in SQL query execution so that if a node cannot reserve a partition, do not
map the partition to this node on the same topology version anymore (a quick fix)


> Exclude false owners from the execution plan based on query response
> --------------------------------------------------------------------
>
>                 Key: IGNITE-6939
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6939
>             Project: Ignite
>          Issue Type: Task
>      Security Level: Public(Viewable by anyone) 
>            Reporter: Alexey Goncharuk



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
