ignite-dev mailing list archives

From Ivan Rakov <ivan.glu...@gmail.com>
Subject Re: Tx lock partial happens before
Date Mon, 15 Jul 2019 20:49:42 GMT
Anton,

> Step-by-step:
> 1) primary locked on key mention (get/put) at pessimistic/!read-committed tx
> 2) backups locked on prepare
> 3) primary unlocked on finish
> 4) backups unlocked on finish (after the primary)
> correct?
Yes, this matches my understanding of the transaction protocol, with one 
minor exception: steps 3 and 4 are inverted in the case of one-phase commit.
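The lock lifecycle above can be modeled with a minimal Java sketch. This is purely illustrative: the class, method and event names are my own, not Ignite internals.

```java
// Illustrative model of the 2PC lock lifecycle discussed above;
// names are assumptions, NOT Ignite internals.
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseCommitSketch {
    final List<String> events = new ArrayList<>();

    void pessimisticTx(boolean onePhaseCommit) {
        events.add("primary locked");       // 1) on first get/put in a pessimistic tx
        events.add("backups locked");       // 2) on prepare
        if (!onePhaseCommit) {
            events.add("primary unlocked"); // 3) on finish
            events.add("backups unlocked"); // 4) on finish, after the primary
        }
        else {
            // One-phase commit: steps 3 and 4 are inverted.
            events.add("backups unlocked");
            events.add("primary unlocked");
        }
    }
}
```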

> Agree, but seems there is no need to acquire the lock, we have just to wait
> until entry becomes unlocked.
> - entry locked means that previous tx's "finish" phase is in progress
> - entry unlocked means reading value is up-to-date (previous "finish" phase
> finished)
> correct?
Diving deeper: an entry is locked if its GridCacheMapEntry.localCandidates 
queue is not empty (the first item in the queue is the transaction that 
currently owns the lock).
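A toy model of that queue-based check, assuming a hypothetical simplified entry (the real GridCacheMapEntry is far more involved; only the idea of localCandidates is mirrored here):

```java
// Toy model of the "entry is locked" check via a candidate queue;
// all names and types here are simplified assumptions.
import java.util.ArrayDeque;
import java.util.Deque;

public class EntryLockSketch {
    // Transactions competing for this entry's lock, in arrival order.
    private final Deque<Long> localCandidates = new ArrayDeque<>();

    // Entry is locked iff the candidate queue is non-empty.
    boolean lockedLocally() {
        return !localCandidates.isEmpty();
    }

    // The head of the queue is the lock owner (null when unlocked).
    Long lockOwner() {
        return localCandidates.peekFirst();
    }

    void addCandidate(long txId) {
        localCandidates.addLast(txId);
    }

    // Owner releases the lock; the next candidate (if any) becomes the owner.
    void releaseOwner() {
        localCandidates.pollFirst();
    }
}
```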

> we have just to wait
> until entry becomes unlocked.
This may work.
If the consistency checking code has acquired the lock on the primary, the 
backup can be in one of two states:
- not locked, and no new locks will appear while we are holding the lock on 
the primary;
- still locked by the transaction that owned the lock on the primary just 
before our check, in which case the checking code should simply wait for the 
lock to be released.
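A minimal sketch of that wait, assuming a hypothetical lock flag with notification on release. This is not the real entry code, just the shape of the idea: the checker never acquires the backup lock, it only waits for the previous tx's finish phase.

```java
// Hedged sketch of "just wait until the entry becomes unlocked":
// the checker holds the primary lock, so the backup entry is either already
// unlocked or will be unlocked by the previous tx's finish phase.
// The flag/monitor below is a simplification, not Ignite code.
public class BackupCheckSketch {
    private final Object mux = new Object();
    private boolean locked;

    void lock() {
        synchronized (mux) {
            locked = true;
        }
    }

    // Called by the previous tx's finish phase on the backup.
    void unlock() {
        synchronized (mux) {
            locked = false;
            mux.notifyAll();
        }
    }

    // Consistency check: do not acquire the lock, only wait for its release,
    // after which the backup value is up-to-date.
    void awaitUnlocked(long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        synchronized (mux) {
            while (locked) {
                long left = deadline - System.currentTimeMillis();
                if (left <= 0)
                    throw new IllegalStateException("Previous tx did not release the lock in time");
                mux.wait(left);
            }
        }
    }
}
```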

Best Regards,
Ivan Rakov

On 15.07.2019 9:34, Anton Vinogradov wrote:
> Ivan R.
>
> Thanks for joining!
>
> I got the idea, but I'm not sure I've got a way to fix it.
>
> AFAIK (can be wrong, please correct if necessary), at 2PC, locks are
> acquired on backups during the "prepare" phase and released at "finish"
> phase after primary fully committed.
> Step-by-step:
> 1) primary locked on key mention (get/put) at pessimistic/!read-committed tx
> 2) backups locked on prepare
> 3) primary unlocked on finish
> 4) backups unlocked on finish (after the primary)
> correct?
>
> So, acquiring locks on backups outside the "prepare" phase may cause
> unexpected behavior in case of a primary failure or other errors.
> It is definitely possible to update the failover logic to solve this issue,
> but that seems to be an overcomplicated way.
> The main question is: is there any simple way?
>
>>> checking read from backup will just wait for commit if it's in progress.
> Agree, but seems there is no need to acquire the lock, we have just to wait
> until entry becomes unlocked.
> - entry locked means that previous tx's "finish" phase is in progress
> - entry unlocked means reading value is up-to-date (previous "finish" phase
> finished)
> correct?
>
> On Mon, Jul 15, 2019 at 8:37 AM Павлухин Иван <vololo100@gmail.com>
wrote:
>
>> Anton,
>>
>> I did not know about the mechanics of locking entries on backups during
>> the prepare phase. Thank you for pointing that out!
>>
>> Fri, 12 Jul 2019 at 22:45, Ivan Rakov <ivan.glukos@gmail.com>:
>>> Hi Anton,
>>>
>>>> Each get method now checks the consistency.
>>>> Check means:
>>>> 1) tx lock acquired on primary
>>>> 2) gained data from each owner (primary and backups)
>>>> 3) data compared
>>> Did you consider acquiring locks on backups as well during your check,
>>> just like 2PC prepare does?
>>> If there's HB between steps 1 (lock primary) and 2 (update primary +
>>> lock backup + update backup), you may be sure that there will be no
>>> false-positive results and no deadlocks as well. Protocol won't be
>>> complicated: checking read from backup will just wait for commit if it's
>>> in progress.
>>>
>>> Best Regards,
>>> Ivan Rakov
>>>
>>> On 12.07.2019 9:47, Anton Vinogradov wrote:
>>>> Igniters,
>>>>
>>>> Let me explain the problem in detail.
>>>> Read Repair in a pessimistic tx (locks acquired on primary, full sync,
>>>> 2PC) is able to see a consistency violation because the backups are not
>>>> updated yet.
>>>> It seems to be a bad idea to "fix" the code to unlock the primary only
>>>> when the backups are updated; this would definitely cause a performance
>>>> drop.
>>>> Currently, there is no explicit sync feature that allows waiting until
>>>> the backups are updated by the previous tx.
>>>> The previous tx just sends GridNearTxFinishResponse to the originating
>>>> node.
>>>> Bad ideas on how to handle this:
>>>> - retry some number of times (false positives are still possible)
>>>> - lock the tx entry on backups (will definitely break the failover logic)
>>>> - wait for the same entry version on backups with some timeout (would
>>>> require huge changes to the "get" logic, and false positives are still
>>>> possible)
>>>>
>>>> Is there any simple fix for this issue?
>>>> Thanks for tips in advance.
>>>>
>>>> Ivan,
>>>> thanks for your interest
>>>>
>>>>>> 4. Very fast and lucky txB writes a value 2 for the key on primary and
>>>>>> backup.
>>>> AFAIK, reordering is not possible since backups are "prepared" before
>>>> the primary releases the lock.
>>>> So, consistency is guaranteed by failover and by the "prepare" phase of 2PC.
>>>> It seems the problem is NOT with consistency in AI, but with the
>>>> consistency detection implementation (RR) and possible "false positive"
>>>> results.
>>>> BTW, I checked the 1PC case (only one data node in the test) and got no
>>>> issues.
>>>> On Fri, Jul 12, 2019 at 9:26 AM Павлухин Иван <vololo100@gmail.com>
>> wrote:
>>>>> Anton,
>>>>>
>>>>> Is such behavior observed for 2PC or for the 1PC optimization? Doesn't
>>>>> it mean that things can be even worse and an inconsistent write is
>>>>> possible on a backup? E.g. in this scenario:
>>>>> 1. txA writes a value 1 for the key on primary.
>>>>> 2. txA unlocks the key on primary.
>>>>> 3. txA freezes before updating backup.
>>>>> 4. Very fast and lucky txB writes a value 2 for the key on primary and
>>>>> backup.
>>>>> 5. txA wakes up and writes 1 for the key on backup.
>>>>> 6. As result there is 2 on primary and 1 on backup.
>>>>>
>>>>> Naively it seems that locks should be released after all replicas are
>>>>> updated.
>>>>>
>>>>> Wed, 10 Jul 2019 at 16:36, Anton Vinogradov <av@apache.org>:
>>>>>> Folks,
>>>>>>
>>>>>> I'm now investigating unexpected repairs [1] in the case of ReadRepair
>>>>>> usage in testAccountTxNodeRestart.
>>>>>> I updated [2] the test to check whether any repairs happen.
>>>>>> The test's name is now "testAccountTxNodeRestartWithReadRepair".
>>>>>>
>>>>>> Each get method now checks the consistency.
>>>>>> Check means:
>>>>>> 1) tx lock acquired on primary
>>>>>> 2) gained data from each owner (primary and backups)
>>>>>> 3) data compared
>>>>>>
>>>>>> Sometimes, a backup may have an obsolete value during such a check.
>>>>>>
>>>>>> It seems this happens because the tx commit on the primary goes in the
>>>>>> following way (check the code [3] for details):
>>>>>> 1) perform localFinish (releases the tx lock)
>>>>>> 2) perform dhtFinish (commits on backups)
>>>>>> 3) transfer control back to the caller
>>>>>>
>>>>>> So it seems the problem here is that "tx lock released on primary" does
>>>>>> not mean that the backups are updated, while "commit() method finished
>>>>>> in the caller's thread" does.
>>>>>> This means that, currently, there is no happens-before between
>>>>>> 1) thread 1 committed data on the primary and the tx lock can be
>>>>>> reobtained
>>>>>> 2) thread 2 reads from a backup
>>>>>> but there is still a strong HB between "commit() finished" and "backup
>>>>>> updated".
>>>>>>
>>>>>> So it seems to be possible, for example, to get a notification from a
>>>>>> continuous query, then read from a backup and get an obsolete value.
>>>>>>
>>>>>> Is this "partial happens before" behavior expected?
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-11973
>>>>>> [2] https://github.com/apache/ignite/pull/6679/files
>>>>>> [3]
>>>>>> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal#finishTx
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Ivan Pavlukhin
>>>>>
>>
>>
>> --
>> Best regards,
>> Ivan Pavlukhin
>>
