cassandra-commits mailing list archives

From "Christopher Smith (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000
Date Fri, 27 Sep 2013 03:41:04 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Smith updated CASSANDRA-6106:
-----------------------------------------

    Description: 
I noticed that this blog post, http://aphyr.com/posts/294-call-me-maybe-cassandra, mentions issues
with millisecond rounding in timestamps, and I was able to reproduce the issue. If I specify
a timestamp in a mutating query, I get microsecond precision, but if I don't, I get timestamps
rounded to the nearest millisecond, at least for my first query on a given connection, which
substantially increases the probability of a collision.

I believe I found the offending code, though I am by no means sure I found every instance.
I think we probably need a fairly comprehensive replacement of all uses of System.currentTimeMillis()
with System.nanoTime().
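
To make the difference concrete, here is a minimal sketch in Java. It is not the attached patch: the class name MicroClock and all of its methods are hypothetical. System.nanoTime() is only specified relative to an arbitrary origin, so the sketch assumes the nanosecond counter has to be anchored to a wall-clock reading taken once at class load. The third method goes one step further and guarantees that no two calls in the same JVM return the same value.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration only; not code from Cassandra or from the patch.
final class MicroClock {
    // Wall-clock reading (in microseconds) paired with the nanoTime() reading
    // taken at roughly the same moment, both captured once at class load.
    private static final long BASE_WALL_MICROS = System.currentTimeMillis() * 1000L;
    private static final long BASE_NANOS = System.nanoTime();

    // Last value handed out by uniqueMicros().
    private static final AtomicLong LAST = new AtomicLong(Long.MIN_VALUE);

    private MicroClock() {}

    // Millisecond-derived timestamp: always a multiple of 1000, so every
    // write issued within the same millisecond receives the same value.
    static long millisBasedMicros() {
        return System.currentTimeMillis() * 1000L;
    }

    // Wall-clock base plus the time elapsed on the nanosecond counter,
    // expressed in microseconds. Assumption: the anchor is never refreshed,
    // so later wall-clock adjustments are not followed.
    static long nanoBasedMicros() {
        return BASE_WALL_MICROS + (System.nanoTime() - BASE_NANOS) / 1000L;
    }

    // Strictly increasing within this JVM: if the clock has not advanced
    // since the previous call, return the previous value plus one.
    static long uniqueMicros() {
        return LAST.updateAndGet(prev -> Math.max(prev + 1, nanoBasedMicros()));
    }
}
```

Even uniqueMicros() only prevents collisions inside a single JVM. Two coordinators on separate nodes can still produce identical timestamps, so finer resolution lowers the probability of a tie without removing it.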

There seems to be some confusion here, so I'd like to clarify: the purpose of this patch is
NOT to improve the precision of ordering guarantees for concurrent writes to cells. The purpose
of this patch is to reduce the probability that concurrent writes to cells are deemed to have
occurred at *the same time*, which is when Cassandra violates its atomicity guarantee.

To clarify the failure scenario: Cassandra promises that writes to the same record are "atomic",
so if you do something like:

{quote}
create table foo (
  i int PRIMARY KEY,
  x int,
  y int
);
{quote}
and then send these two queries concurrently (separate connections, potentially to separate
nodes):

{quote}
insert into foo (i, x, y) values (1, 8, -8);
insert into foo (i, x, y) values (1, -8, 8);
{quote}

you can't be sure which of the two writes will be the "last" one, but if you then run:

{quote}
select x, y from foo where i = 1;
{quote}

you don't know whether x is "8" or "-8",
you don't know whether y is "-8" or "8",
but YOU DO KNOW: x + y will equal 0.

EXCEPT... if the timestamps assigned to the two queries are *exactly* the same, in which case
x + y = 16. :-( Now your writes are not atomic.
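
The scenario can be modelled with a small, self-contained Java simulation. This is a sketch under stated assumptions, not Cassandra's code: the names CellMergeDemo, Cell, reconcile, merge and write are all hypothetical. It assumes that each cell is reconciled independently by last-write-wins on its timestamp and that a tie is broken in favor of the larger value. That tie-break rule was chosen only because it reproduces the x + y = 16 outcome described above; the comparison Cassandra actually performs may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; not Cassandra's implementation.
final class CellMergeDemo {
    // One column value together with the timestamp of the write that set it.
    static final class Cell {
        final int value;
        final long timestampMicros;

        Cell(int value, long timestampMicros) {
            this.value = value;
            this.timestampMicros = timestampMicros;
        }
    }

    private CellMergeDemo() {}

    // Last-write-wins for a single cell. Assumption: on an exact timestamp
    // tie, the cell holding the larger value wins.
    static Cell reconcile(Cell a, Cell b) {
        if (a.timestampMicros != b.timestampMicros) {
            return a.timestampMicros > b.timestampMicros ? a : b;
        }
        return a.value >= b.value ? a : b;
    }

    // Merges two versions of the same row column by column.
    static Map<String, Cell> merge(Map<String, Cell> first, Map<String, Cell> second) {
        Map<String, Cell> merged = new HashMap<>(first);
        for (Map.Entry<String, Cell> e : second.entrySet()) {
            merged.merge(e.getKey(), e.getValue(), CellMergeDemo::reconcile);
        }
        return merged;
    }

    // Models "insert into foo (i, x, y) values (1, x, y)" with one timestamp
    // shared by both cells of the write.
    static Map<String, Cell> write(int x, int y, long timestampMicros) {
        Map<String, Cell> row = new HashMap<>();
        row.put("x", new Cell(x, timestampMicros));
        row.put("y", new Cell(y, timestampMicros));
        return row;
    }

    static int sum(Map<String, Cell> row) {
        return row.get("x").value + row.get("y").value;
    }

    public static void main(String[] args) {
        // Distinct timestamps: the later write wins as a whole.
        System.out.println(sum(merge(write(8, -8, 1000L), write(-8, 8, 1001L)))); // prints 0
        // Identical timestamps: each cell is resolved on its own, so the row is torn.
        System.out.println(sum(merge(write(8, -8, 1000L), write(-8, 8, 1000L)))); // prints 16
    }
}
```

With distinct timestamps the row always comes entirely from one write. With a tie, x is taken from one write and y from the other, which is the loss of atomicity described above.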

    
> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp
with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6106
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: DSE Cassandra 3.1, but also HEAD
>            Reporter: Christopher Smith
>            Priority: Minor
>              Labels: collision, conflict, timestamp
>         Attachments: microtimstamp.patch
>
>

