cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Comment Edited] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
Date Sat, 17 May 2014 13:02:17 GMT


Benedict edited comment on CASSANDRA-4718 at 5/17/14 1:01 PM:

But wasn't the sep branch actually faster more often than it was slower? And yes, it routes
intelligently, but to both replicas...?

I've attached three graphs to visualise the output from Jason's test runs, which I hope express
better what I was trying to get across in my previous comment: the sep branch is actually
faster in the workload that operates over a smaller domain (run3), and it is also more often
faster for the disk bound workloads, although I expect that difference is most likely random
variation. The evidence for that is threefold: run2 shows the two branches crossing each other
at different points; in run1 sep is universally faster (while both perform the same amount of IO
per operation); and, unless there is a bug, it should be very difficult for either patch to
demonstrate a major difference in performance on disk bound workloads, because so long as all
read workers are scheduled, the disk alone should define our throughput. I'm doing my best to
produce a workload on the hardware I have available to rule out any such issue, but my point
is that it is very hard to get accurate, consistent numbers from which to draw strong conclusions
when the difference we're measuring is smaller than the measurement noise.
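
To illustrate the point about noise, here is a minimal sketch of one crude way to check whether
the gap between two branches is smaller than the run-to-run spread. The class and method names
are hypothetical, the op rates in main are made-up placeholders (not Jason's results), and
comparing the difference in means against the larger sample standard deviation is only a rough
heuristic, not a proper significance test.

```java
public class NoiseCheckSketch {
    // Arithmetic mean of the samples.
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    // Sample standard deviation (n - 1 in the denominator); needs at least two samples.
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) squares += (x - m) * (x - m);
        return Math.sqrt(squares / (xs.length - 1));
    }

    // True when the difference in means is no larger than the noisier branch's spread,
    // i.e. when the runs give no strong reason to prefer either branch.
    static boolean withinNoise(double[] a, double[] b) {
        double diff = Math.abs(mean(a) - mean(b));
        double noise = Math.max(sampleStdDev(a), sampleStdDev(b));
        return diff <= noise;
    }

    public static void main(String[] args) {
        // Hypothetical op rates (ops/s) from repeated runs of each branch.
        double[] sep = {100, 110, 90};
        double[] other = {102, 95, 108};
        System.out.println("difference within noise: " + withinNoise(sep, other));
    }
}
```

With only a handful of runs per branch the spread estimate is itself unreliable, which is the
reason for wanting more repeatable numbers before drawing conclusions.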

was (Author: benedict):
But like I said, the sep branch was actually faster more often than it was slower? And yes
it routes intelligently, but to both replicas...?

> More-efficient ExecutorService for improved throughput
> ------------------------------------------------------
>                 Key: CASSANDRA-4718
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1.0
>         Attachments: 4718-v1.patch,, austin_diskbound_read.svg, aws.svg,
aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png, belliotsmith_branches-stress.out.txt,
jason_read.svg, jason_read_latency.svg, jason_run1.svg, jason_run2.svg, jason_run3.svg, jason_write.svg,
op costs of various queues.ods, stress op rate with various queues.ods, stress_2014May15.txt,
stress_2014May16.txt, v1-stress.out
> Currently all our execution stages dequeue tasks one at a time.  This can result in contention
> between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue).
> One approach to mitigating this would be to make consumer threads do more work in "bulk"
> instead of just one task per dequeue.  (Producer threads tend to be single-task oriented by
> nature, so I don't see an equivalent opportunity there.)
> BlockingQueue has a drainTo(collection, int) method that would be perfect for this.
> However, no ExecutorService in the jdk supports using drainTo, nor could I google one.
> What I would like to do here is create just such a beast and wire it into (at least)
> the write and read stages.  (Other possible candidates for such an optimization, such as the
> CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.)
> AbstractExecutorService may be useful.  The implementations of ICommitLogExecutorService
> may also be useful. (Despite the name these are not actual ExecutorServices, although they
> share the most important properties of one.)
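
The "bulk" dequeue idea in the description above can be sketched roughly as follows. This is
only an illustration of BlockingQueue.drainTo(collection, int) on a consumer thread, not the
approach taken by any of the attached patches; the class name, the runOneBatch helper and the
MAX_BATCH value are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class BatchDrainSketch {
    // Hypothetical cap on tasks taken per dequeue; not a value from the ticket.
    static final int MAX_BATCH = 32;

    // Blocks until one task is available, then drains up to MAX_BATCH - 1 more
    // without blocking, runs them all, and returns how many tasks were run.
    static int runOneBatch(BlockingQueue<Runnable> queue) throws InterruptedException {
        List<Runnable> batch = new ArrayList<>(MAX_BATCH);
        batch.add(queue.take());              // wait for at least one task
        queue.drainTo(batch, MAX_BATCH - 1);  // take whatever else is already queued
        for (Runnable task : batch) {
            task.run();
        }
        return batch.size();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        AtomicInteger counter = new AtomicInteger();
        for (int i = 0; i < 40; i++) {
            queue.add(counter::incrementAndGet);
        }
        int first = runOneBatch(queue);
        int second = runOneBatch(queue);
        System.out.println(first + " then " + second + " tasks run, total " + counter.get());
    }
}
```

A real consumer would call runOneBatch in a loop on each worker thread. The consumer interacts
with the queue once per batch rather than once per task, which is where any reduction in
producer/consumer contention would come from.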

This message was sent by Atlassian JIRA