ignite-issues mailing list archives

From "Alexander Menshikov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (IGNITE-4908) Ignite.reentrantLock looks much slower than IgniteCache.lock.
Date Tue, 03 Oct 2017 15:04:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189813#comment-16189813 ]

Alexander Menshikov edited comment on IGNITE-4908 at 10/3/17 3:03 PM:
----------------------------------------------------------------------

[~amashenkov] [~avinogradov]
In IgniteCache.lock, on the messaging side, every node that tries to acquire a lock sends a
GridNearLockRequest to the coordinator. The coordinator then sends a GridNearLockResponse to one
of those nodes, which becomes the lock owner. I'm not sure, but the process looks unfair, because
my logs show that the first node to send a request was not the first to become the lock owner.
When it is done, the lock owner sends a GridNearUnlockRequest, and the coordinator sends the
response to the next node in the waiting list.
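
For context, here is a minimal usage sketch of the IgniteCache.lock API whose message flow is
described above (not from this ticket; the cache name and key are illustrative):

{code:java}
import java.util.concurrent.locks.Lock;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class CacheLockExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // lock() is only supported on TRANSACTIONAL caches.
            CacheConfiguration<String, Integer> cfg = new CacheConfiguration<>("locks");
            cfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);

            IgniteCache<String, Integer> cache = ignite.getOrCreateCache(cfg);

            Lock lock = cache.lock("myLockKey");

            lock.lock();    // Sends GridNearLockRequest; returns once GridNearLockResponse arrives.
            try {
                // Critical section guarded by the distributed lock.
            }
            finally {
                lock.unlock();  // Sends GridNearUnlockRequest.
            }
        }
    }
}
{code}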

One call to IgniteCache#invoke() sends 2 messages: GridNearAtomicSingleUpdateInvokeRequest
and GridNearAtomicUpdateResponse.
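
For comparison, a minimal IgniteCache#invoke() sketch (the key and processor logic are
illustrative, not part of this ticket); a single call like this produces exactly that
request/response pair on an atomic cache:

{code:java}
import javax.cache.processor.EntryProcessorException;
import javax.cache.processor.MutableEntry;

import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.CacheEntryProcessor;

public class InvokeExample {
    /** Illustrative processor: increments a counter entry on the node that owns the key. */
    static class IncrementProcessor implements CacheEntryProcessor<String, Integer, Integer> {
        @Override public Integer process(MutableEntry<String, Integer> entry, Object... args)
            throws EntryProcessorException {
            Integer val = entry.getValue();
            int next = (val == null ? 0 : val) + 1;

            entry.setValue(next);

            return next;
        }
    }

    static int increment(IgniteCache<String, Integer> cache, String key) {
        // One GridNearAtomicSingleUpdateInvokeRequest out, one GridNearAtomicUpdateResponse back.
        return cache.invoke(key, new IncrementProcessor());
    }
}
{code}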

So the IgniteCache.lock implementation needs 3 messages per lock+unlock (GridNearLockRequest,
GridNearLockResponse, and GridNearUnlockRequest), while my implementation based on IgniteCache#invoke
needs 5 messages: 2 messages for the try-lock, 1 release message in case the lock attempt failed
(a new message which I introduced for this task), and 2 more messages for the release.

*This happens because the IgniteCache.lock implementation uses a blocking operation whose response
is sent only once the node becomes the lock owner, whereas IgniteCache#invoke has no way to wait
for a particular value to appear in the cache.*
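
To make that concrete, here is a rough, hypothetical sketch of the try-lock EntryProcessor from
the design quoted in the issue description below (the class and field names are mine, not the
actual patch): the processor appends the calling node to the candidate queue and reports whether
it ended up at the head.

{code:java}
import java.util.ArrayList;
import java.util.UUID;

import javax.cache.processor.EntryProcessorException;
import javax.cache.processor.MutableEntry;

import org.apache.ignite.cache.CacheEntryProcessor;

/** Hypothetical try-lock step: add the calling node as a candidate and report
 *  whether it is now at the first position (i.e. owns the lock). */
class TryLockProcessor implements CacheEntryProcessor<String, ArrayList<UUID>, Boolean> {
    private final UUID nodeId;

    TryLockProcessor(UUID nodeId) {
        this.nodeId = nodeId;
    }

    @Override public Boolean process(MutableEntry<String, ArrayList<UUID>> entry, Object... args)
        throws EntryProcessorException {
        ArrayList<UUID> candidates = entry.getValue();

        if (candidates == null)
            candidates = new ArrayList<>();

        if (!candidates.contains(nodeId))   // Retry failover: never add the same candidate twice.
            candidates.add(nodeId);

        entry.setValue(candidates);

        // The lock is acquired only if we are the first candidate; otherwise the caller waits
        // locally (e.g. on an AbstractQueuedSynchronizer) until the unlock processor removes
        // the head of the queue and notifies the next candidate.
        return candidates.get(0).equals(nodeId);
    }
}
{code}

Invoking it is a single cache.invoke(lockKey, new TryLockProcessor(localNodeId)) call, i.e. the
2-message round trip counted above; if it returns false the caller parks locally until notified.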

I propose to move forward with my implementation; the only remaining work is failover.
Right now it gives us a *17x speed-up of the lock operation on a 10-node cluster*.

I will also create an additional task describing the implementation details of a perfect lock
implementation.
 
Thoughts?



> Ignite.reentrantLock looks much slower than IgniteCache.lock.
> -------------------------------------------------------------
>
>                 Key: IGNITE-4908
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4908
>             Project: Ignite
>          Issue Type: Improvement
>          Components: data structures
>    Affects Versions: 1.8
>            Reporter: Andrew Mashenkov
>            Assignee: Alexander Menshikov
>
> Design discussed with Alexander:
> 1) Lock 
> Entry Processor (sync) -> 
> ....add candidate. 
> ....returns "added candidate at first position"
> ....retry failover -> 
> ........if already at first position -> return true
> In case lock not acquired, wait for acquire (AbstractQueuedSynchronizer should be used).
> 2) Unlock 
> Entry Processor (async) -> 
> ....remove candidate at first position
> ....retry failover -> remove only if "candidate at first position" equals to expected
> ....listener ->
> ........notify current "candidate at first position" it got lock
> 3) Failover
> 3.1) Originating node failed
> Failed node listener ->
> ....For each local(primary) lock ->
> ........Entry Processor (async) ->
> ............remove candidates related to the failed node
> ............retry failover not needed
> ............listener -> 
> ................if "candidate at first position" removed ->
> ....................notify current "candidate at first position" it got lock
> 3.2) Primary node failed
> After rebalancing schedule Callable ->
> ....For each local(primary) lock ->
> ........Entry Processor (async) ->
> ............remove candidates related to failed nodes
> ............retry failover not needed
> ............listener -> 
> ................notify current "candidate at first position" it got lock



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
