cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7392) Abort in-progress queries that time out
Date Fri, 11 Sep 2015 19:19:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741187#comment-14741187 ]

Ariel Weisberg edited comment on CASSANDRA-7392 at 9/11/15 7:19 PM:
--------------------------------------------------------------------

bq. How would we calculate the rate without also storing the totals? I'm not sure variable
rate logging is the best way to go about it given that we are trying to achieve a poor man's
"slow query log" for free. The issue is how to avoid polluting log files, so the effort required
to support variable rate logging would perhaps be better spent logging the timed-out queries
elsewhere?
For the rate you don't need totals; you just need the count of timeouts and the period of
time that count covers. When trying to keep the log frequency down, I was thinking of periods
of minutes or hours during which one node is timing queries out.
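
Roughly what I have in mind, as a sketch only (the class and method names are hypothetical, not from the patch):

{code:java}
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch only: a rate needs just a counter plus the start of the period it
// covers, not running totals.
class TimeoutRateReporter
{
    private static final Logger logger = LoggerFactory.getLogger(TimeoutRateReporter.class);

    private final LongAdder timeouts = new LongAdder();
    private volatile long periodStartNanos = System.nanoTime();

    void onTimeout()
    {
        timeouts.increment();
    }

    // Invoked periodically (think minutes, not milliseconds) by a scheduled task.
    void report()
    {
        long now = System.nanoTime();
        long count = timeouts.sumThenReset();
        long minutes = Math.max(1, TimeUnit.NANOSECONDS.toMinutes(now - periodStartNanos));
        if (count > 0)
            logger.info("{} queries timed out over the last {} minute(s) (~{} per minute)",
                        count, minutes, count / minutes);
        periodStartNanos = now;
    }
}
{code}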

You are right that we could create a slow query log file, create a logger for it, and configure
it via logback. It would then roll over separately from the other logs. I think that is a good
idea. It also means you can be more chatty, raising the threshold for how many queries you log
each time and how much detail you log about them.
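
A minimal sketch of what the Java side could look like, assuming a hypothetical logger name; the point is that logback.xml can attach a dedicated rolling file appender to that name (with additivity off) so it rolls over independently of the system log:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch only: the logger name is hypothetical. Messages sent to a named
// logger can be routed by logback.xml to their own rolling file appender,
// separate from system.log.
public final class SlowQueryLog
{
    private static final Logger SLOW_QUERY = LoggerFactory.getLogger("cassandra.slowquery");

    public static void record(String cql, long elapsedMillis)
    {
        SLOW_QUERY.info("{} ms: {}", elapsedMillis, cql);
    }
}
{code}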

If we are doing a dedicated slow query log, maybe we also want to make the format easily
parseable so it's not hard to write tools to analyze/visualize it. Old-school databases
like MySQL and Postgres have multiple log files for different purposes (auth, system, slow
query), so it's not an unprecedented direction to move in. We should get buy-in first, though,
since someone might make the case that they don't want a bunch of separate log files.
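
Purely as an illustration (this is not a proposed format), a single delimited line per slow query would be trivial for external tools to split:

{code}
2015-09-11 19:19:46,123 | 10.0.0.1 | 523 ms | SELECT * FROM ks.tbl WHERE id = ?
{code}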

bq. Sounds good, done. Enforcing a minimum of 50 milliseconds however slows down the unit
tests a bit, since it gets a bit messy to override the minimum as well. The trouble is that
the singleton is submitted for scheduling before we can change any class field. I could move
the properties to another class to make it a bit cleaner.
Is it just a few seconds? I would be ok with a handful of seconds.
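
For reference, the "move the properties to another class" idea could look something like this sketch (the property and class names are made up, not from the patch):

{code:java}
// Sketch only: hypothetical property and class names. Keeping both the
// minimum and the interval in a small holder class means a test can override
// them via system properties before the singleton is constructed and
// scheduled, without reaching into the singleton's fields.
final class SlowQueryLogProperties
{
    static final long MIN_INTERVAL_MS = Long.getLong("cassandra.slow_query_min_interval_ms", 50L);

    static long checkIntervalMillis()
    {
        long requested = Long.getLong("cassandra.slow_query_check_interval_ms", 500L);
        return Math.max(MIN_INTERVAL_MS, requested);
    }
}
{code}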

New thoughts
* I think other slow query logs include the full parameters of the query. Maybe we want
to consider logging the fully formatted CQL statement, or at least having that option. You can
end up with a query that has a reasonable execution plan, but the bound parameters involved
make it slow. Maybe we log the query up to a maximum size by default and let people increase
it via a property (see the sketch after this list).
* For the properties, there is a constant for the "cassandra." prefix in DatabaseDescriptor or
Config (I can't recall which).
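
Here is a sketch of the truncation idea from the first bullet; the property name is made up:

{code:java}
// Sketch only: the property name is hypothetical. Log the formatted CQL up
// to a maximum length by default and let operators raise the limit.
final class LoggedQueryTruncation
{
    private static final int MAX_LOGGED_CQL_CHARS =
        Integer.getInteger("cassandra.max_logged_cql_chars", 500);

    static String forLogging(String formattedCql)
    {
        return formattedCql.length() <= MAX_LOGGED_CQL_CHARS
             ? formattedCql
             : formattedCql.substring(0, MAX_LOGGED_CQL_CHARS) + "... [truncated]";
    }
}
{code}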


> Abort in-progress queries that time out
> ---------------------------------------
>
>                 Key: CASSANDRA-7392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 3.x
>
>
> Currently we drop queries that time out before we get to them (because the node is overloaded),
> but not queries that time out while being processed. (This is particularly common for index
> queries on data that shouldn't be indexed.) Adding the latter, and logging when we have to
> interrupt one, gets us a poor man's "slow query log" for free.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
