cassandra-commits mailing list archives

From "Aaron Whiteside (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9328) WriteTimeoutException thrown when LWT concurrency > 1, despite the query duration taking MUCH less than cas_contention_timeout_in_ms
Date Fri, 06 Nov 2015 00:36:27 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992789#comment-14992789 ]

Aaron Whiteside commented on CASSANDRA-9328:
--------------------------------------------

Using a version id (to execute the conditional update on) and a transaction id (to determine
whether an update that received a WTE, but really succeeded, was applied by the current
thread/transaction/operation) still does not work.

Thread A: reads version 1
Thread A: updates version 1 to 2, transaction id to ABC, and sets account balance to $0+$100=$100,
but receives a WTE.
Thread B: reads version 2
Thread B: updates version 2 to 3, transaction id to XYZ, and sets account balance to $100+$500=$600,
wins the race, no WTEs anywhere in sight.
Thread B: is happy!
Thread A: tries again, reads version 3 this time, sees that version 3 is greater than its
previous version 2, now it checks the transaction id and finds that it is also different.

How can thread A know whether its update failed or succeeded, given that someone else has
updated the record between thread A doing the update and reading the record again?
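
The interleaving above can be replayed with a small in-memory sketch. This is not Cassandra
code and makes no driver calls: the Row class and the cas_update helper are hypothetical
stand-ins for the record and the conditional update, and the WTE is modelled simply by
thread A never looking at the result of its first update.

```python
from dataclasses import dataclass


@dataclass
class Row:
    version: int
    txid: str
    balance: int


def cas_update(row: Row, expected_version: int, txid: str, delta: int) -> bool:
    """Hypothetical conditional update: apply only if the version still matches."""
    if row.version != expected_version:
        return False
    row.version += 1
    row.txid = txid
    row.balance += delta
    return True


row = Row(version=1, txid="", balance=0)

# Thread A: reads version 1. The update is applied, but the client receives a
# WTE, so the return value is deliberately discarded here.
a_read = row.version
cas_update(row, a_read, "ABC", 100)

# Thread B: reads version 2 and updates it without any error.
b_read = row.version
b_applied = cas_update(row, b_read, "XYZ", 500)

# Thread A: reads the record again and tries to work out what happened.
a_sees_own_version = row.version == a_read + 1
a_sees_own_txid = row.txid == "ABC"

print(row)
print("A can confirm its write:", a_sees_own_version or a_sees_own_txid)
```

Both checks come back false even though thread A's update was applied, which is the same
thing thread A would observe if its update had been lost and other writers had moved the
record on to version 3.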

At this point thread A might assume it failed, try again, and add another $100 to the balance,
causing more money to appear in the account than would be expected. Or it might choose to
abandon the transaction, but if the WTE was actually due to a timeout and not contention, the
balance will have $100 less than is expected.
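
The arithmetic behind the two bad outcomes can be tabulated with a short sketch. The
final_balance function is a hypothetical helper, not part of the attached tests; it only
adds up the deposits from the scenario ($100 from thread A, $500 from thread B) for each
combination of "was A's first write really applied" and "does A retry or abandon".

```python
EXPECTED = 0 + 100 + 500  # both deposits applied exactly once


def final_balance(a_write_applied: bool, a_retries: bool) -> int:
    """Sum the deposits for one combination of real outcome and retry policy."""
    balance = 0
    if a_write_applied:
        balance += 100  # thread A's first attempt landed despite the WTE
    balance += 500      # thread B's update
    if a_retries:
        balance += 100  # thread A adds its $100 again
    return balance


outcomes = {
    (applied, retries): final_balance(applied, retries)
    for applied in (True, False)
    for retries in (True, False)
}

for (applied, retries), balance in outcomes.items():
    policy = "retry" if retries else "abandon"
    print(f"write applied={applied!s:5} policy={policy:7} "
          f"balance=${balance} (off by {balance - EXPECTED:+d})")
```

Each policy is right in one case and wrong in the other: retrying overshoots by $100 when
the first write was applied, and abandoning undershoots by $100 when it was not.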

And no one is happy.

> WriteTimeoutException thrown when LWT concurrency > 1, despite the query duration taking MUCH less than cas_contention_timeout_in_ms
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9328
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9328
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>            Reporter: Aaron Whiteside
>             Fix For: 2.1.x
>
>         Attachments: CassandraLWTTest.java, CassandraLWTTest2.java
>
>
> WriteTimeoutException thrown when LWT concurrency > 1, despite the query duration taking MUCH less than cas_contention_timeout_in_ms.
> Unit test attached, run against a 3 node cluster running 2.1.5.
> If you reduce the threadCount to 1, you never see a WriteTimeoutException. If the WTE is due to not being able to communicate with other nodes, why does the concurrency > 1 cause inter-node communication to fail?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
