cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7392) Abort in-progress queries that time out
Date Mon, 27 Jul 2015 18:44:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643077#comment-14643077
] 

Ariel Weisberg edited comment on CASSANDRA-7392 at 7/27/15 6:43 PM:
--------------------------------------------------------------------

Not done reviewing. Just wanted to drop a thought on design/performance. Right now it schedules
a monitoring task for every operation. This has a cost linear to the number of operations
and adds a contended queue with a not great queue implementation (priority queue). My wild
guess is that there may be a workload we care about that will highlight this. If we do our
jobs it should bottleneck eventually as it is shared state accessed for every single operation.

What I would suggest is inverting the process. Have a thread wake up periodically and walk
the list of in progress operations looking for ones that need to be timed out. You can store
the in progress operation in a thread local. When the thread local is created it inserts into
a COW array list that the monitoring thread can iterate. That way there is no mutation of
shared state and it only enters shared state periodically when the the monitoring thread walks
the list.

This alternative approach has a cost linear to the number of threads dispatching tasks and
almost no coherence cost.

Just to validate my understanding, when a query is aborted it either sends no response or
sends an error response? I just want to be sure that it doesn't silently claim there is no
more data.

The logging in OpMonitor isn't rate limited or aggregating. One super fancy thing to do would
be to count all the timed out operations for the same name operation and report them periodically
as a count. It's not urgent, but finding a way to report on what happened that is less verbose,
but still useful would be good. Reporting on how long it took the timeout to fire is useful
because it can give you an idea of how timely the expiration thread is which can hint at external
pauses.

ReadCommandVerbHandler constructs an OpState with a string that is the CQL command. It would
be nice not to have to do any string construction there and if we did maybe try and cache
it somehow.

In DroppableRunnable you switch from System.currentTimeMillis() to nanoTime(). I always looking
for opportunities to avoid getting the current time (get it once and pass it around). We grab
the time once when reading a message so theoretically if there has been no blocking you don't
need to grab it again if we had ponies.

WRT to OpState. Use unsafe instead of AtomicReferenceFieldUpdater. ATRFU is not super fast
last I heard. If you are in the business of looking to avoid extra objects/indirection is
it possible to stuff the fields inside an existing object that is present everywhere OpState
is already used? mixins where art thou.

The OpMonitoring tests don't wait long enough. I would write those to sleep, and then spin
for a looong time checking to see if the cancellation took place. The reason I write this
way is that I don't want a huge hiccup on the test node to randomly throw up a failing test.
It will run fast most of the time since it is spinning.


was (Author: aweisberg):
Not done reviewing. Just wanted to drop a thought on design/performance. Right now it schedules
a monitoring task for every operation. This has a cost linear to the number of operations
and adds a contended queue with a not great queue implementation (priority queue). My wild
guess is that there may be a workload we care about that will highlight this. If we do our
jobs it should bottleneck eventually as it is shared state accessed for every single operation.

What I would suggest is inverting the process. Have a thread wake up periodically and walk
the list of in progress operations looking for ones that need to be timed out. You can store
the in progress operation in a thread local. When the thread local is created it inserts into
a COW array list that the monitoring thread can iterate. That way there is no mutation of
shared state and it only enters shared state periodically when the the monitoring thread walks
the list.

This alternative approach has a cost linear to the number of threads dispatching tasks and
almost no coherence cost.

Just to validate my understanding, when a query is aborted it either sends no response or
sends an error response? I just want to be sure that it doesn't silently claim there is no
more data.

ReadCommandVerbHandler constructs an OpState with a string that is the CQL command. It would
be nice not to have to do any string construction there and if we did maybe try and cache
it somehow.

In DroppableRunnable you switch from System.currentTimeMillis() to nanoTime(). I always looking
for opportunities to avoid getting the current time (get it once and pass it around). We grab
the time once when reading a message so theoretically if there has been no blocking you don't
need to grab it again if we had ponies.

WRT to OpState. Use unsafe instead of AtomicReferenceFieldUpdater. ATRFU is not super fast
last I heard. If you are in the business of looking to avoid extra objects/indirection is
it possible to stuff the fields inside an existing object that is present everywhere OpState
is already used? mixins where art thou.

The OpMonitoring tests don't wait long enough. I would write those to sleep, and then spin
for a looong time checking to see if the cancellation took place. The reason I write this
way is that I don't want a huge hiccup on the test node to randomly throw up a failing test.
It will run fast most of the time since it is spinning.

> Abort in-progress queries that time out
> ---------------------------------------
>
>                 Key: CASSANDRA-7392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>             Fix For: 3.x
>
>
> Currently we drop queries that time out before we get to them (because node is overloaded)
but not queries that time out while being processed.  (Particularly common for index queries
on data that shouldn't be indexed.)  Adding the latter and logging when we have to interrupt
one gets us a poor man's "slow query log" for free.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message