cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7392) Abort in-progress queries that time out
Date Thu, 10 Sep 2015 17:37:47 GMT

[ https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739184#comment-14739184 ]

Ariel Weisberg edited comment on CASSANDRA-7392 at 9/10/15 5:37 PM:
--------------------------------------------------------------------

bq. Ok, but does a lazySet() really buy a lot compared to a CAS? I mean we are introducing
a race, albeit with no consequences at the moment.
Truthfully, not much. This is a CAS once per operation on an uncontended memory location that
is hopefully in a nearby cache. Things that only occur once per operation aren't going to
be super high yield. The biggest benefits come from tight loops where the thing you made
faster/eliminated occurs N, lg N, or N lg N times per operation.

You do get to eliminate a CAS loop, so it's not just a performance benefit. I don't feel strongly
about it either way since this is a once-per-operation optimization.
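
To make the comparison concrete, here is a minimal sketch (the class and field names are made up for illustration, not taken from the patch):

{code:java}
import java.util.concurrent.atomic.AtomicLong;

public class TimeoutFlag
{
    private final AtomicLong lastObservedNanos = new AtomicLong();

    // CAS loop: retries until the store wins, and the successful CAS
    // waits for the store buffers to drain before execution continues.
    public void publishWithCas(long nowNanos)
    {
        long prev;
        do
        {
            prev = lastObservedNanos.get();
        }
        while (!lastObservedNanos.compareAndSet(prev, nowNanos));
    }

    // lazySet: a plain ordered store with no loop and no full fence; on x86
    // it is just a normal store plus a compiler reordering barrier.
    public void publishWithLazySet(long nowNanos)
    {
        lastObservedNanos.lazySet(nowNanos);
    }
}
{code}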

The CAS is going to stall execution while the CPU waits for the store buffers to flush to
L1. A lazy set on x86 does not cause a stall; it just prevents instruction reordering by the
compiler. x86 is a TSO architecture, so stores to shared memory are never reordered from program
order and the CPU does no additional work compared to what it always does.

Waiting for the store buffers to drain is maybe a minimum of 10-15 nanoseconds. I have never
seen measurements of store buffer flushing time so I can't give an exact answer. A fast operation
is going to be tens of microseconds. In C* I don't know how long operations take; it could be
hundreds, it really depends on the operation. We are talking about a fraction of a percent of
total operation time and, more importantly, there is no impact on scalability.

-I have some knowledge gaps as to what the store buffer can/can't do in practice so I can't
be very accurate-
Another thing I didn't mention until I got confirmation from Benedict: a word-aligned blind
store (a write to a field without reading the value) can be performed asynchronously by the
store buffer even if the cache line is not resident. So in the best case you eliminate a stall
out to L2/L3/main memory.

bq. Properties created. Should they be user visible in the yaml too? What about MessagingService.LOG_DROPPED_INTERVAL_IN_MS,
should this be replaced with the property we just created or shall we leave it alone? Basically
these are messages that weren't even processed, as opposed to started and then timed out.
As for a default value for checking, the default read timeout is 5 seconds, so let's use 1%
of this, 50 milliseconds?
I see options in the YAML as a bad thing and I also don't like undocumented options. I also
fear shipping with hard-coded constants because it means there is no option other than recompiling
in the field.

I see properties as bridging the gap until the next release, where we either fix the implementation
so tuning the defaults isn't necessary, or make it an option in the YAML if we can't make it
just work without operator intervention.

WRT the default: I would say 1% of the timeout is pretty high precision. You do have some
insight there in thinking about the frequency as a % of the timeout. I would say go with that
and set the check frequency as a % of the timeout unless overridden by a property, but also
set a minimum value. I think 50 milliseconds is good as a minimum. I would say 10% is good
enough, so we would be off by around 1/10th of the timeout, but I don't feel strongly.
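
Something along these lines is what I have in mind (the property name and method are hypothetical, just to show the shape):

{code:java}
// Illustrative sketch only: derive the check interval from the timeout,
// clamp it to a minimum, and let a -D property override it in the field.
static long checkIntervalMs(long timeoutMs)
{
    long derived = Math.max(50L, timeoutMs / 10); // 10% of the timeout, never below 50ms
    return Long.getLong("cassandra.monitoring_check_interval_in_ms", derived);
}
{code}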

bq. What about MessagingService.LOG_DROPPED_INTERVAL_IN_MS, should this be replaced with the
property we just created or shall we leave it alone? 
That log message is fixed size (% of verbs) and covers all dropped messages and not just the
subset of timeouts you are working on right now. I would say leave it alone just by virtue
of being out of scope. It might benefit from backoff.

bq. Something like NoSpamLogger except it needs to cache similar log messages rather than
identical messages and it should be configurable? I have run out of time today so I haven't
thought about this much so far.
I think that's it. NoSpamLogger doesn't allow you to specify a policy for backoff as opposed
to fixed intervals. I think that is a missing capability. Have it support a policy, and then
have it tell you whether it is time to log so you can decide whether to clear the stats.
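
A minimal sketch of the shape I'm imagining (this is not the existing NoSpamLogger API; the interface and names are illustrative):

{code:java}
// Illustrative only: let callers plug in a backoff policy instead of the
// fixed minimum interval NoSpamLogger uses today.
interface BackoffPolicy
{
    // Given how many times we have already logged, return how long to wait
    // before logging again, e.g. exponential: base * 2^timesLogged.
    long nextIntervalNanos(int timesLogged);
}

final class PolicyBasedNoSpamLogger
{
    private final BackoffPolicy policy;
    private long nextLogAtNanos;
    private int timesLogged;

    PolicyBasedNoSpamLogger(BackoffPolicy policy)
    {
        this.policy = policy;
    }

    // Returns true when it is time to log, so the caller can also decide
    // whether to clear its accumulated stats for the next window.
    synchronized boolean shouldLog(long nowNanos)
    {
        if (nowNanos < nextLogAtNanos)
            return false;
        nextLogAtNanos = nowNanos + policy.nextIntervalNanos(timesLogged++);
        return true;
    }
}
{code}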

One caveat that occurs to me about logging this kind of thing at a variable rate is that absolute
counts of events are no longer informative. You need to log a rate so you can compare without
having to do your own math. Even then there is some harm, because you could grow the reporting
interval to include time where nothing is going wrong, distorting the reported rate. I think
there is some tension between my pony and providing precise data. What do you think?
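
For the rate-vs-count issue, a tiny helper shows what I mean (sketch only): divide the window count by the actual elapsed window so the numbers stay comparable, accepting that a window that grew to include quiet time dilutes the rate.

{code:java}
import java.util.concurrent.TimeUnit;

final class RateReporter
{
    // Sketch only: convert an absolute count over a variable-length window
    // into events per second, so log lines remain comparable even as the
    // backoff policy stretches the reporting interval.
    static double perSecond(long eventCount, long windowNanos)
    {
        return eventCount / (windowNanos / (double) TimeUnit.SECONDS.toNanos(1));
    }
}
{code}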



> Abort in-progress queries that time out
> ---------------------------------------
>
>                 Key: CASSANDRA-7392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 3.x
>
>
> Currently we drop queries that time out before we get to them (because node is overloaded)
> but not queries that time out while being processed.  (Particularly common for index queries
> on data that shouldn't be indexed.)  Adding the latter and logging when we have to interrupt
> one gets us a poor man's "slow query log" for free.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
