cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7392) Abort in-progress queries that time out
Date Thu, 10 Sep 2015 09:37:46 GMT

[ https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738498#comment-14738498 ]

Stefania commented on CASSANDRA-7392:
-------------------------------------

bq. I don't think the check is necessary, lazySet is sufficient

Thanks, replaced.

bq. Don't reset MonitorableThreadLocal, that requires retrieving the threadlocal, and then
storing a value in the AR (which would be cheap if lazy set). Set the state associated with
the task to "done" and let the monitoring thread ignore it. Ideally you only have to get the
threadlocal once per request.

Ok, I removed the reset. The monitoring thread is already checking only operations that are
in progress.

bq. When doing state transitions on the monitoring ref I don't think you need CAS. You can
read and then lazySet and there is a small chance that the logging will think something was
dropped when it wasn't. The primary functionality of shedding still works and that is always
an approximation.

Ok, but does lazySet() really buy much compared to a CAS? I mean, we would be introducing
a race, albeit one with no consequences at the moment.
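A minimal sketch of the tradeoff under discussion, assuming an atomic state holder for the monitored operation (the state names and methods here are illustrative, not the patch's actual code):

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: contrasts a CAS transition with a plain
// read-then-lazySet transition on a monitoring state.
public class MonitoringState {
    enum State { IN_PROGRESS, ABORTED, COMPLETED }

    private final AtomicReference<State> state =
            new AtomicReference<>(State.IN_PROGRESS);

    // CAS version: the transition happens exactly once or fails.
    boolean abortWithCas() {
        return state.compareAndSet(State.IN_PROGRESS, State.ABORTED);
    }

    // lazySet version: cheaper (no full store fence), but two threads can
    // both observe IN_PROGRESS and both "win" -- a benign race if the only
    // consequence is an occasionally miscounted timeout in the log.
    boolean abortWithLazySet() {
        if (state.get() != State.IN_PROGRESS)
            return false;
        state.lazySet(State.ABORTED);
        return true;
    }

    State current() { return state.get(); }
}
```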

bq. How does creating a monitoring state ref that is not stored anywhere work? Is this for
things that read, but aren't actually read operations? I looked up the call chain and that
appears to be the case.

Yes, read commands are used internally for things that are not read queries from users, so
I don't think we should abort these. Actually I changed the call chain for {{SinglePartitionReadCommand.queryMemtableAndDisk()}}
so that we pass down the {{readOp}} in an otherwise empty {{executionController}}.

bq. Can you make these properties? I would also say that 10 milliseconds is closer to more
often (of course I am hand waving). If it's necessary to check that often for precision/timeliness
then sure. Not sure what cross node timeouts are generally set at. It will probably be fine
either way.

Properties created. Should they also be user-visible in the yaml? What about {{MessagingService.LOG_DROPPED_INTERVAL_IN_MS}}:
should it be replaced with the property we just created, or shall we leave it alone? Basically
those are messages that weren't even processed, as opposed to started and then timed out.
As for a default check interval, the default read timeout is 5 seconds, so let's use 1% of
that, 50 milliseconds?
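A sketch of what such a JVM-property-backed default could look like; the property name here is made up for illustration and is not necessarily the one used in the patch:

```java
// Illustrative: a check interval read from a system property,
// falling back to the 50 ms default discussed above (1% of the
// 5 second default read timeout). The property name is hypothetical.
public class MonitoringProperties {
    static final long CHECK_INTERVAL_MS =
            Long.getLong("cassandra.monitoring_check_interval_ms", 50L);
}
```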

bq. So here is a pony. One thing I am leery of is logging that is overly chatty relative
to its utility, rolling the logs. Reporting on N different timed-out queries every five seconds
is going to be too much and cause log rolling, pushing out other relevant statements. So the pony is to
sort them by # of timeouts, only display N=5, and aggregate them all into a single log statement.
If there are more than five, add an ellipsis indicating some were dropped, along with the
#. Then add a property allowing the maximum number to be displayed to be configured so people
can get all of them on the outside chance it is helpful.

Done, we log only the top 5 operations with the highest number of timeouts, where 5 is a configurable
property. For each operation we log the CQL string, how many times it timed out, the interval
and the details of the first and last timeout.
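The aggregation described above could be sketched roughly as follows; the method and map layout are illustrative only, not the committed code:

```java
import java.util.*;
import java.util.stream.Collectors;

// Illustrative: sort failed operations by timeout count, keep the top N
// in a single log line, and note how many entries were omitted.
public class TimeoutReport {
    static String render(Map<String, Integer> timeoutsByQuery, int maxToShow) {
        List<Map.Entry<String, Integer>> sorted = timeoutsByQuery.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .collect(Collectors.toList());

        // One aggregated statement instead of one line per query
        String shown = sorted.stream()
                .limit(maxToShow)
                .map(e -> e.getKey() + ": " + e.getValue() + " timeouts")
                .collect(Collectors.joining(", "));

        int omitted = sorted.size() - Math.min(sorted.size(), maxToShow);
        return omitted > 0 ? shown + " ... (" + omitted + " more)" : shown;
    }
}
```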

bq. The next pony is backoff. In a bad situation where we have elected to report the last
time we ran, we should start reporting less frequently up to a point. This is functionality
that can be factored out usefully IMO for other kinds of logging/reporting. And then people
can have the option of turning backoff off if it's not what they want (or load the backoff
strategy by reflection). But like I said, ponies. The important thing is to not spam the log,
causing rolling, and obscure other log info.

Something like {{NoSpamLogger}}, except it needs to cache similar log messages rather than
identical ones, and it should be configurable? I have run out of time today so I haven't
thought about this much so far.
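One possible shape for the backoff idea, assuming a caller that polls it on each reporting tick; the class and its policy (double the interval while failures persist, reset on a quiet window) are an illustration, not the approach actually chosen:

```java
// Illustrative backoff reporter: double the reporting interval each time
// we report while failures keep occurring, up to a cap, and reset to the
// base interval once a reporting window passes with nothing to report.
public class BackoffReporter {
    private final long baseIntervalMs;
    private final long maxIntervalMs;
    private long currentIntervalMs;
    private long lastReportMs;

    BackoffReporter(long baseIntervalMs, long maxIntervalMs) {
        this.baseIntervalMs = baseIntervalMs;
        this.maxIntervalMs = maxIntervalMs;
        this.currentIntervalMs = baseIntervalMs;
        this.lastReportMs = 0;
    }

    // Returns true if a report should be emitted at time 'nowMs'.
    boolean shouldReport(long nowMs, boolean hadFailures) {
        if (nowMs - lastReportMs < currentIntervalMs)
            return false;
        lastReportMs = nowMs;
        if (hadFailures) {
            currentIntervalMs = Math.min(currentIntervalMs * 2, maxIntervalMs);
            return true;
        }
        currentIntervalMs = baseIntervalMs; // quiet window: reset backoff
        return false;
    }
}
```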

bq. MonitoringTask doesn't test the minor, but important case, if there are no failed operations
does it refrain from logging? So < timeout, or abort() returns false.

Done with abort returning false for all operations.

bq. This kind of thing should have a Thread.yield() in it. We run several unit tests in parallel
and if tests spin they can interfere with other tests. Same for similar loops.

Done, thanks.
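The kind of loop in question might look like this small sketch (names are made up), where the yield keeps a busy-wait from starving other tests running in parallel:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative: a test helper that spins on a condition but yields the
// CPU on each iteration so concurrently running tests are not starved.
public class SpinWait {
    static void awaitTrue(AtomicBoolean flag, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!flag.get()) {
            if (System.currentTimeMillis() > deadline)
                throw new IllegalStateException("timed out waiting for condition");
            Thread.yield(); // give other test threads a chance to run
        }
    }
}
```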

bq. Reduce report delay millis for test? Wall clock time spent sleeping extends the total
time that tests run. Or maybe just poke the task manually so that you don't have to wait at
all? Reducing the timeout is problematic since it leads to false positives. If there is a
way to get a free lunch with the test running faster it would be good.

Done, we extract the failed operations directly now. I disabled reporting except for one test,
just to exercise the code path.

> Abort in-progress queries that time out
> ---------------------------------------
>
>                 Key: CASSANDRA-7392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 3.x
>
>
> Currently we drop queries that time out before we get to them (because node is overloaded)
but not queries that time out while being processed.  (Particularly common for index queries
on data that shouldn't be indexed.)  Adding the latter and logging when we have to interrupt
one gets us a poor man's "slow query log" for free.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
