ignite-issues mailing list archives

From "Andrey Gura (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (IGNITE-2854) Need to implement deadlock detection
Date Mon, 04 Apr 2016 00:17:25 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222229#comment-15222229 ]

Andrey Gura edited comment on IGNITE-2854 at 4/4/16 12:17 AM:
--------------------------------------------------------------

The algorithm described in the previous comment has drawbacks.

It can't detect a deadlock for a transaction that has timed out while involved in a deadlock,
or it can report an invalid deadlock due to race conditions.

For example, suppose transactions {{TX1}} and {{TX2}} have the same timeout and start time.
{{TX1}} holds a lock on key {{K1}} and requests a lock on {{K2}}, while {{TX2}} holds a lock
on key {{K2}} and requests a lock on {{K1}}, so this is a deadlock. {{K1}} and {{K2}} have
different primary nodes, so both transactions are distributed.

When {{TX1}} and {{TX2}} time out, all {{GridDhtColocatedLockFuture}} and blocked {{GridDhtLockFuture}}
instances time out as well. {{GridDhtLockFuture.onTimeout}} initiates deadlock detection while
{{GridDhtColocatedLockFuture.onTimeout}} releases locks and then rolls back the corresponding
transaction. So we have incomplete information about the transactions' state and either can't
detect the deadlock or detect something invalid like {{TX1 <-> TX1}}.
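
To make the example concrete, here is a minimal sketch of a wait-for graph, where an edge from one transaction to another means the first waits for a lock held by the second. {{WaitForGraph}}, {{addWait}} and {{findCycle}} are hypothetical names used only for illustration; this is not the Ignite implementation. With both edges present the cycle is found. If one transaction has already released its locks and rolled back, its edge is missing and no cycle is reported, which is the incomplete-information case described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical wait-for graph: an edge A -> B means transaction A waits for a lock held by B. */
final class WaitForGraph {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    /** Records that {@code waiter} is blocked on a lock currently held by {@code holder}. */
    void addWait(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new LinkedHashSet<>()).add(holder);
    }

    /** Returns a path that starts and ends at {@code start}, or an empty list if no such cycle exists. */
    List<String> findCycle(String start) {
        Deque<String> path = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        return dfs(start, start, path, visited) ? new ArrayList<>(path) : Collections.emptyList();
    }

    /** Depth-first search that succeeds when an edge leads back to {@code target}. */
    private boolean dfs(String cur, String target, Deque<String> path, Set<String> visited) {
        path.addLast(cur);
        for (String next : waitsFor.getOrDefault(cur, Collections.emptySet())) {
            if (next.equals(target)) {
                path.addLast(next);
                return true;
            }
            // Visit each transaction at most once so unrelated cycles cannot loop forever.
            if (visited.add(next) && dfs(next, target, path, visited))
                return true;
        }
        path.removeLast();
        return false;
    }
}
```

For the scenario above, {{addWait("TX1", "TX2")}} plus {{addWait("TX2", "TX1")}} gives the cycle TX1, TX2, TX1 from {{findCycle("TX1")}}. Dropping the second edge, as happens after {{TX2}} is rolled back, gives an empty result.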

The second problem is that in the current implementation remote nodes do not send a response
to the near node when a {{GridDhtLockFuture}} times out. So we can't print deadlock information
in the user thread.

Suggested solution:

Deadlock detection is initiated by the near node when {{GridDhtColocatedNearFuture.onTimeout}}
is invoked. At the same time, all {{GridDhtLockFuture}}s register futures in the transaction
manager. These futures will be completed when a special request signalling that detection has
finished is received from the near node.
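
A rough sketch of the registration step, assuming a simple map of pending futures keyed by transaction id. {{DetectionRegistry}}, {{register}} and {{onDetectionFinished}} are hypothetical names, and plain {{java.util.concurrent.CompletableFuture}} stands in for Ignite's internal future types; the real transaction manager API may look quite different.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the part of a transaction manager that tracks pending detections. */
final class DetectionRegistry {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /**
     * Called on a remote node when its lock future times out.
     * The returned future stays incomplete until the near node reports that detection has finished.
     */
    CompletableFuture<String> register(String txId) {
        return pending.computeIfAbsent(txId, id -> new CompletableFuture<>());
    }

    /**
     * Called when the "detection finished" request arrives from the near node.
     * Returns true if a pending future existed and was completed by this call.
     */
    boolean onDetectionFinished(String txId, String result) {
        CompletableFuture<String> fut = pending.remove(txId);
        return fut != null && fut.complete(result);
    }
}
```

In this sketch a remote node would chain its lock release onto the returned future, for example with {{thenRun}}, so that lock state stays visible until the near node has finished collecting it.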

Race conditions are still possible because a concurrent deadlock detection process will be
started for each timed-out transaction.



was (Author: agura):
The algorithm described in the previous comment has one drawback: it can't detect a deadlock
for a transaction that has timed out while involved in a deadlock, or it can report an invalid
deadlock due to race conditions.

For example, suppose transactions {{TX1}} and {{TX2}} have the same timeout and start time.
{{TX1}} holds a lock on key {{K1}} and requests a lock on {{K2}}, while {{TX2}} holds a lock
on key {{K2}} and requests a lock on {{K1}}, so this is a deadlock. {{K1}} and {{K2}} have
different primary nodes, so both transactions are distributed.

When {{TX1}} and {{TX2}} time out, all {{GridDhtColocatedLockFuture}} and blocked {{GridDhtLockFuture}}
instances time out as well. {{GridDhtLockFuture.onTimeout}} initiates deadlock detection while
{{GridDhtColocatedLockFuture.onTimeout}} releases locks and then rolls back the corresponding
transaction. So we have incomplete information about the transactions' state and either can't
detect the deadlock or detect something invalid like {{TX1 <-> TX1}}.

> Need to implement deadlock detection
> ------------------------------------
>
>                 Key: IGNITE-2854
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2854
>             Project: Ignite
>          Issue Type: New Feature
>          Components: cache
>    Affects Versions: 1.5.0.final
>            Reporter: Valentin Kulichenko
>            Assignee: Andrey Gura
>             Fix For: 1.6
>
>
> Currently, if a transactional deadlock occurs, there is no easy way to find out which
> locks were reordered.
> We need to add a mechanism that will collect information about awaiting candidates, analyze
> it and show the guilty keys. Most likely this should be implemented with the help of a custom
> discovery message.
> In addition, we should automatically execute this mechanism if a transaction times out and
> add the information to the timeout exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
