cassandra-commits mailing list archives

From "Jason Brown (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6995) Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to read stage
Date Tue, 08 Apr 2014 05:46:17 GMT


Jason Brown commented on CASSANDRA-6995:

bq. [~xedin] Can we schedule requests to the appropriate stage directly from thrift selector

Well, to a degree, you could say we already do that :), it just happens late, in StorageProxy/ARE
(but only when we're executing locally). However, if you mean (which I think you do, but correct
me if I am mistaken) "can we detect, further upstream (before SP), which stage we'll eventually
use, and use it locally on the coordinator", I think that is possible in both thrift-land
(CassandraServer) and cql3 (QueryProcessor or in the CQLStatement implementations, I believe).
However, I see a couple of issues there:

* You will always be incurring the thread context switch to one of the stages (even if you
are not reading locally, which is probably most of the use cases in the wild).
* A given node's stages will be contended for by both coordinator and data node use.
This perhaps would suggest a model similar to what [~benedict] mentioned earlier: a semaphore
to limit the uses of an individual stage.
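
To make the semaphore idea concrete, here is a minimal sketch of a stage whose in-flight work is capped by a semaphore. It is an illustration only, not Cassandra code: the class name BoundedStage, its methods, and the thread and permit counts are all hypothetical, and a plain fixed thread pool stands in for a real stage.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a stage; not part of Cassandra.
final class BoundedStage {
    private final ExecutorService stage;
    private final Semaphore permits;

    BoundedStage(int threads, int maxInFlight) {
        this.stage = Executors.newFixedThreadPool(threads);
        this.permits = new Semaphore(maxInFlight);
    }

    // Returns false, without queueing anything, when the stage is at its limit.
    boolean trySubmit(Runnable task) {
        if (!permits.tryAcquire())
            return false;
        try {
            stage.execute(() -> {
                try {
                    task.run();
                } finally {
                    permits.release(); // free the slot once the task has finished
                }
            });
        } catch (RejectedExecutionException e) {
            permits.release(); // the task never ran, so give the permit back
            throw e;
        }
        return true;
    }

    void shutdown() {
        stage.shutdown();
        try {
            stage.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        BoundedStage reads = new BoundedStage(2, 4);
        int accepted = 0;
        for (int i = 0; i < 16; i++) {
            if (reads.trySubmit(() -> { /* stand-in for a read */ }))
                accepted++;
        }
        reads.shutdown();
        System.out.println("accepted " + accepted + " of 16 submissions");
    }
}
```

The permit is taken on the submitting thread, so a caller learns immediately that the stage is saturated and can stall or shed load further upstream instead of growing the stage's queue.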

I do like the idea of stalling/blocking requests further upstream (closer to the caller),
and perhaps breaking it down by the type of operation (reads vs. writes vs. schema changes
vs. ...). However, I think that might be different from the original intent of this ticket,
which is to eliminate the additional context switch when reading locally on the coordinator.

> Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to read stage
> ----------------------------------------------------------------------------------------
>                 Key: CASSANDRA-6995
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.0.7
>         Attachments: 6995-v1.diff, syncread-stress.txt
> When performing a read local to a coordinator node, AbstractReadExecutor will create
a new SP.LocalReadRunnable and drop it into the read stage for asynchronous execution. If
you are using a client that intelligently routes read requests to a node holding the data
for a given request, and are using CL.ONE/LOCAL_ONE, enqueuing the SP.LocalReadRunnable and
waiting for the context switches (and possible NUMA misses) adds unnecessary latency. We can
reduce that latency and improve throughput by avoiding the queueing and thread context switching
and simply executing the SP.LocalReadRunnable synchronously in the request thread. Testing
on a three node cluster (each with 32 CPUs, 132 GB RAM) yields ~10% improvement in throughput
and ~20% speedup on avg/95/99 percentiles (the 99.9th percentile improved by about 5-10%).
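
The change described above can be sketched roughly as follows. This is not the attached patch and does not use Cassandra's classes: LocalReadDispatch, its ConsistencyLevel enum, the replicaIsLocal flag, and the two-thread pool standing in for the read stage are all assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical illustration; names do not correspond to Cassandra's API.
final class LocalReadDispatch {
    enum ConsistencyLevel { ONE, LOCAL_ONE, QUORUM }

    // Stand-in for the read stage.
    private final ExecutorService readStage = Executors.newFixedThreadPool(2);

    <T> CompletableFuture<T> read(Supplier<T> localRead, ConsistencyLevel cl, boolean replicaIsLocal) {
        boolean singleReplica = cl == ConsistencyLevel.ONE || cl == ConsistencyLevel.LOCAL_ONE;
        if (replicaIsLocal && singleReplica) {
            // Fast path: run on the calling (request) thread, with no queueing
            // and no hand-off to another thread.
            try {
                return CompletableFuture.completedFuture(localRead.get());
            } catch (RuntimeException e) {
                CompletableFuture<T> failed = new CompletableFuture<>();
                failed.completeExceptionally(e);
                return failed;
            }
        }
        // Default path: enqueue on the read stage and complete asynchronously.
        return CompletableFuture.supplyAsync(localRead, readStage);
    }

    void shutdown() {
        readStage.shutdown();
    }

    public static void main(String[] args) {
        LocalReadDispatch dispatch = new LocalReadDispatch();
        Supplier<String> whoRanMe = () -> Thread.currentThread().getName();
        System.out.println("ONE, local:    " + dispatch.read(whoRanMe, ConsistencyLevel.ONE, true).join());
        System.out.println("QUORUM, local: " + dispatch.read(whoRanMe, ConsistencyLevel.QUORUM, true).join());
        dispatch.shutdown();
    }
}
```

Both paths return the same future type, so the caller does not need to know which one was taken; only the thread that executes the read differs.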
