cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8672) Ambiguous WriteTimeoutException while completing pending CAS commits
Date Wed, 01 Apr 2015 19:52:53 GMT


Sylvain Lebresne commented on CASSANDRA-8672:

Apologies, for some reason I didn't understand that you were talking about the commit that is within
{{beginAndRepairPaxos}}. So I agree, that commit should definitely throw a {{WriteTimeoutException}}
with {{WriteType.CAS}} (I think it's an oversight that it doesn't), and in fact, the patches look good
to me so I've committed them. I'll therefore close this issue. If someone wants to discuss further
splitting {{WriteType.CAS}} into more precise values (since that ended up being suggested even though
that's orthogonal to the issue fixed), feel free to open a separate issue (but as I mentioned
above, I'm personally not convinced that it's entirely wise).
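
For readers skimming the thread, the idea of reporting a timeout of the repair commit as {{WriteType.CAS}} can be sketched roughly as below. This is only an illustration, not the committed patch: {{WriteType}}, {{WriteTimeout}}, {{Commit}} and {{commitPending}} are hypothetical stand-ins defined here, not Cassandra's real classes or signatures.

```java
// Hypothetical stand-ins for illustration only; not Cassandra's real classes.
enum WriteType { SIMPLE, CAS }

class WriteTimeout extends RuntimeException {
    final WriteType writeType;

    WriteTimeout(WriteType writeType) {
        super("write timed out (" + writeType + ")");
        this.writeType = writeType;
    }
}

class RepairCommitSketch {
    // Stand-in for a commit step that may time out with some other write type.
    interface Commit {
        void run();
    }

    // Runs the commit of a previously agreed-on proposal. Any timeout is
    // rethrown tagged as CAS, so the caller can see it happened while the
    // CAS machinery was still preparing, not while writing its own update.
    static void commitPending(Commit commit) {
        try {
            commit.run();
        } catch (WriteTimeout e) {
            throw new WriteTimeout(WriteType.CAS);
        }
    }
}
```

With something of this shape, a caller that catches the timeout and inspects its write type sees {{CAS}} for the repair step regardless of what the underlying commit reported.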

> Ambiguous WriteTimeoutException while completing pending CAS commits
> --------------------------------------------------------------------
>                 Key: CASSANDRA-8672
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Stefan Podkowinski
>            Assignee: Tyler Hobbs
>            Priority: Minor
>              Labels: CAS
>             Fix For: 3.0
>         Attachments: storyproxybean-2.0-8672.txt, storyproxybean-2.1-8672.txt
> Any CAS update has a chance to trigger a pending/stalled commit of any previously agreed-on
CAS update. After completing the pending commit, the CAS operation will resume executing
the actual update and possibly also create a new commit. See StorageProxy.cas().
> There are two possible execution paths that might end up throwing a WriteTimeoutException:
> cas() -> beginAndRepairPaxos() -> commitPaxos()
> cas() -> commitPaxos()
> Unfortunately, clients catching a WriteTimeoutException won't be able to tell at which
stage the commit failed. My guess would be that most developers are not aware that beginAndRepairPaxos()
could also trigger a write, and assume that write timeouts refer to a timeout while writing
the actual CAS update. It's therefore not safe to assume that successive CAS or SERIAL read
operations will cause a CAS operation that hit a write timeout to eventually get applied, although
some [best-practices advice|]
claims otherwise.
> At this point the safest bet is possibly to retry the complete business transaction in
case of a WriteTimeoutException. However, as there's a chance that the timeout occurred while
writing the actual CAS operation, another write could potentially complete it, and our CAS
condition will then get a different result upon retry.
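
A rough sketch of what "retry the complete business transaction" could look like on the client side is given below. It assumes the transaction can be made idempotent, here by claiming a row for a given owner, so that a retry can recognise an earlier timed-out attempt that was applied after all. {{CasTimeout}}, {{OwnerRow}} and {{ClaimTransaction}} are hypothetical stand-ins and not part of any Cassandra driver API.

```java
// Hypothetical stand-in for a write timeout whose outcome is unknown.
class CasTimeout extends RuntimeException {}

// Hypothetical stand-in for a row offering a SERIAL read and a conditional write.
interface OwnerRow {
    String serialReadOwner();               // returns null if the row is unclaimed
    boolean setOwnerIfAbsent(String owner); // may throw CasTimeout
}

class ClaimTransaction {
    // Retries the whole read / check / conditional-write sequence.
    // Returns true if `me` ends up as owner, false if someone else does.
    static boolean claim(OwnerRow row, String me, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                String current = row.serialReadOwner();
                if (me.equals(current)) {
                    return true;  // an earlier attempt of ours was applied after all
                }
                if (current != null) {
                    return false; // somebody else holds the row
                }
                if (row.setOwnerIfAbsent(me)) {
                    return true;
                }
                // Condition failed: loop and re-read to learn who won.
            } catch (CasTimeout e) {
                // Outcome unknown: restart the transaction from the read.
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }
}
```

The re-read at the top of each attempt is what guards against the changed CAS result described above: if the timed-out write was completed in the meantime, the retry sees its own value instead of failing the condition blindly. Transactions that are not idempotent (for example, increments) would need an extra marker, such as a unique request id stored with the row, to get the same effect.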

This message was sent by Atlassian JIRA
