cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9649) Paxos ballot in StorageProxy could clash
Date Thu, 25 Jun 2015 11:16:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601016#comment-14601016
] 

Sylvain Lebresne commented on CASSANDRA-9649:
---------------------------------------------

I believe you're right. We need ballots to be globally unique, which given we provide our
own timestamp to {{UUIDGen.getTimeUUID()}} means that we must ensure said timestamp is locally
unique. And that has been broken by CASSANDRA-5667 (before that we were letting {{UUIDGen}}
pick the timestamp which does the right thing).

And dividing {{state.getTimestamp()}} is only one of the problems, we have 2 others: the fact
that {{state.getTimestamp()}} is currently only unique to a connection not a VM, and the fact
that 2 racy operations that do get a {{summary}} might see the same {{mostRecentInProgressCommit}}
and thus pick the same timestamp.

So anyway, I've pushed [here|https://github.com/pcmanus/cassandra/commits/9649] a commit to
fix that. I'm also going to bump the priority since unless I'm forgetting a reason why this
is not a problem, it's kind of a big deal.

I do want to note one point: due to the conjonction of CASSANDRA-7801 and CASSANDRA-5667,
if a node has its clock in the future, other nodes might "synchronize" themselves on that
clock through Paxos operations. While this is probably ok for small drift (this may even be
considered a good thing), this could be rather problematic if a node ends up with a clock
far in the future (even temporarily, due to an operator error for instance). This is a risk
for our Paxos by design, but CASSANDRA-7801 makes the consequences potentially affect other
operation as well. This makes me nervous and while I still think CASSANDRA-7801 is theoreticaly
a good idea, I'm starting to think that it might not be worth the risk and so I wanted to
float the idea of reversing it. I've pushed a branch [here|https://github.com/pcmanus/cassandra/commits/9649-without-7801]
with an additional (simple) commit to show what I have in mind.

The patch is against 2.1. C* 2.0 is affected too but the patch won't apply because it builds
on top of CASSANDRA-7801 which isn't on 2.0. I'm a bit reluctant to mess too much with 2.0
at this point so I would suggest to basically revert CASSANDRA-5667 by simply relying on {{UUIGen.getTimeUUID()}}
without providing our own timestamp.


> Paxos ballot in StorageProxy could clash
> ----------------------------------------
>
>                 Key: CASSANDRA-9649
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9649
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefania
>            Assignee: Stefania
>            Priority: Minor
>
> This code in {{StorageProxy.beginAndRepairPaxos()}} takes a timestamp in microseconds
but divides it by 1000 before adding one. So if the summary is null, ballotMillis would be
the same for up to 1000 possible state timestamp values:
> {code}
>     long currentTime = (state.getTimestamp() / 1000) + 1;
>     long ballotMillis = summary == null
>                                  ? currentTime
>                                  : Math.max(currentTime, 1 +    UUIDGen.unixTimestamp(summary.mostRecentInProgressCommit.ballot));
>     UUID ballot = UUIDGen.getTimeUUID(ballotMillis);
> {code}
> {{state.getTimestamp()}} returns the time in micro seconds and it ensures to add one
microsecond to any previously used timestamp if the client sends the same or an older timestamp.

> Initially I used this code in {{ModificationStatement.casInternal()}}, introduced by
CASSANDRA-9160 to support cas unit tests, but occasionally these tests were failing. It was
only when I ensured uniqueness of the ballot that the tests started to pass reliably.
> I wonder if we could ever have the same issue in StorageProxy?
> cc [~jbellis] and [~slebresne] for CASSANDRA-7801



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message