cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-7392) Abort in-progress queries that time out
Date Thu, 17 Sep 2015 17:28:05 GMT


Ariel Weisberg commented on CASSANDRA-7392:

bq. The last two lines of the first paragraph:
Ah right. Well, it's true that according to the docs (and maybe the JMM, not sure it addresses lazySet)
it only eventually sets the value. To guarantee the value has already been set (is globally visible)
you need some other operation that blocks until the store buffers have drained (which is an incomplete,
implementation-specific way of describing the type of barrier). Even then, that doesn't guarantee
it happens faster (the JMM/docs make no such guarantee); it just makes guarantees about the ordering
of events.

The docs are pretty scary looking because they are definitely reserving the right to delay
the heck out of things, the same way the JMM says that stores aren't guaranteed to be visible until
threads synchronize on the same monitor or volatile field. I could concede that according to the
JMM/docs it could be bad to rely on lazySet for timely propagation. The JMM is pretty conservatively
specified to provide a minimum of guarantees until they are asked for explicitly (synchronized,
volatile), and that leaves what actually happens open to how far the compiler can reorder stuff
(usually not very far) and how much the CPU can buffer (not much, not for long).
I think it is fine for an approximation like timeouts.
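
To make the trade-off concrete, here is a minimal sketch of the kind of use I have in mind. It is not code from the patch: {{ApproximateClock}}, {{tick}} and {{hasTimedOut}} are hypothetical names, and it assumes a single updater thread publishing a coarse timestamp that query threads read to decide whether they have run too long.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not code from the patch: a coarse clock that one
// updater thread publishes with lazySet and query threads read for timeouts.
final class ApproximateClock {
    private final AtomicLong nowMillis = new AtomicLong();

    // Called periodically by a single updater thread.
    void tick(long currentMillis) {
        // Ordered store without a full fence. The docs only promise that the
        // value is set "eventually"; they give no bound on visibility latency.
        nowMillis.lazySet(currentMillis);
    }

    // Volatile read: returns the latest value that has become visible.
    long now() {
        return nowMillis.get();
    }

    // Approximate check: may report a timeout slightly late if the most
    // recent tick is not visible yet, which is acceptable for this use.
    boolean hasTimedOut(long startMillis, long timeoutMillis) {
        return now() - startMillis >= timeoutMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        ApproximateClock clock = new ApproximateClock();
        clock.tick(System.currentTimeMillis());
        long start = clock.now();
        Thread.sleep(20);
        clock.tick(System.currentTimeMillis());
        System.out.println("timed out after 10 ms? " + clock.hasTimedOut(start, 10));
    }
}
```

If timely visibility ever mattered for correctness rather than approximation, swapping {{lazySet}} for {{set}} would be the conservative choice, at the cost of a stronger barrier on every tick.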

bq. Intel Cache coherence protocol (MESI/MESIF)
I think this is another one of those in-practice things. In practice all shared memory CPUs
we have to care about are cache coherent and do something like MESI, where you can't read from
an invalidated cache line (changes to cache lines propagate immediately). In practice (there
is that word again) the bookkeeping to make use of invalidated cache lines in a shared memory
system is probably daunting and that's why it isn't done. Your program would have to be explicit
about which loads are safe to do against an invalidated cache line, or it would have to be
inferred somehow from the memory model of the language.

And that is basically how I arrive at a certain set of assumptions about what "eventually" means
in lazySet. Also, at some point Martin Thompson mentioned talking to an Intel engineer who
said that store buffers always drain as fast as they can. It's probably buried in the Mechanical
Sympathy Google group or his blog.

* [This comment seems out of place?|]
* [What is this change to Slices, is it fixing an NPE? I am guessing you found a few things
making heavier use of toCQLString.|]
* [This is random now so the comment is out of date|]
* If only a fixed number of timed-out queries are reported, how about only storing references
to the first N? Allocating an ArrayList per query doesn't make much sense if aggregating doesn't
really work yet, although it's necessary to count identical queries.
* [Check if debug is enabled before formatting the string and logging?|]
* [This is set to 30|]
* I think the utests would be closer to trunk if you rebased. The dtests will get a lot worse
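
For the "first N" suggestion above, something along these lines is what I mean. This is a hedged sketch, not the patch's code: {{TimedOutQueries}}, {{record}} and {{maxReported}} are hypothetical names, and it assumes queries are identified by their CQL string. It keeps one counter per distinct query for the first N distinct queries and only counts the rest, so nothing is allocated per timed-out query once the cap is reached.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: remember only the first N distinct timed-out queries,
// keep a count for each, and count everything else as unreported.
final class TimedOutQueries {
    private final int maxReported;
    // Insertion-ordered so the report lists queries in first-seen order.
    private final Map<String, Integer> counts = new LinkedHashMap<>();
    private long unreported;

    TimedOutQueries(int maxReported) {
        this.maxReported = maxReported;
    }

    synchronized void record(String cql) {
        Integer seen = counts.get(cql);
        if (seen != null) {
            counts.put(cql, seen + 1);   // identical query: bump its counter
        } else if (counts.size() < maxReported) {
            counts.put(cql, 1);          // one of the first N distinct queries
        } else {
            unreported++;                // beyond the cap: count only
        }
    }

    synchronized int count(String cql) {
        Integer seen = counts.get(cql);
        return seen == null ? 0 : seen;
    }

    synchronized long unreported() {
        return unreported;
    }

    public static void main(String[] args) {
        TimedOutQueries report = new TimedOutQueries(2);
        report.record("SELECT * FROM ks.t1");
        report.record("SELECT * FROM ks.t1");
        report.record("SELECT * FROM ks.t2");
        report.record("SELECT * FROM ks.t3");
        System.out.println("t1 timed out " + report.count("SELECT * FROM ks.t1") + " times");
        System.out.println("not reported: " + report.unreported());
    }
}
```

A synchronized map is the simplest thing that works here; whether contention on it matters depends on how often queries actually time out.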

> Abort in-progress queries that time out
> ---------------------------------------
>                 Key: CASSANDRA-7392
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 3.x
> Currently we drop queries that time out before we get to them (because node is overloaded)
but not queries that time out while being processed.  (Particularly common for index queries
on data that shouldn't be indexed.)  Adding the latter and logging when we have to interrupt
one gets us a poor man's "slow query log" for free.

This message was sent by Atlassian JIRA
