cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
Date Sat, 17 May 2014 13:07:19 GMT

[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000743#comment-14000743 ]

Benedict edited comment on CASSANDRA-4718 at 5/17/14 1:07 PM:
--------------------------------------------------------------

But the sep branch was actually faster more often than it was slower? And yes it routes intelligently,
but to both replicas...?

I've attached three graphs to visualise the output from Jason's test runs, which I hope better express what I was trying to get across in my previous comment: the sep branch is actually faster in the workload that operates over a smaller domain (run3), and it is also more often faster for the disk-bound workloads, though I expect that difference is most likely random variation.

The evidence is that
# run2 shows the performance of both branches crossing each other at different points;
# run1 is universally faster for sep (and both run1 and run2 perform the same amount of IO per operation); and
# unless there is a bug, it should be very difficult for either patch to demonstrate a major performance difference on a disk-bound workload - so long as all read workers are scheduled, the disk alone should define our throughput.

This workload is clearly disk bound: the same hardware was pushing 250k/s with similar record sizes when operating exclusively in memory, and we're seeing only around 5% of that now - unless, possibly, in-memory index scans are occupying all of the time.

I'm doing my best to produce a comparable workload on hardware I have available to me, to absolutely 100% rule out any such issue as (3), but my point is that it is very hard to get accurate, consistent numbers from which to draw strong conclusions when the difference we're measuring is smaller than the measurement noise.



> More-efficient ExecutorService for improved throughput
> ------------------------------------------------------
>
>                 Key: CASSANDRA-4718
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1.0
>
>         Attachments: 4718-v1.patch, PerThreadQueue.java, austin_diskbound_read.svg, aws.svg,
aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png, belliotsmith_branches-stress.out.txt,
jason_read.svg, jason_read_latency.svg, jason_run1.svg, jason_run2.svg, jason_run3.svg, jason_write.svg,
op costs of various queues.ods, stress op rate with various queues.ods, stress_2014May15.txt,
stress_2014May16.txt, v1-stress.out
>
>
> Currently all our execution stages dequeue tasks one at a time.  This can result in contention
between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue).
> One approach to mitigating this would be to make consumer threads do more work in "bulk"
instead of just one task per dequeue.  (Producer threads tend to be single-task oriented by
nature, so I don't see an equivalent opportunity there.)
> BlockingQueue has a drainTo(collection, int) method that would be perfect for this. 
However, no ExecutorService in the jdk supports using drainTo, nor could I google one.
> What I would like to do here is create just such a beast and wire it into (at least)
the write and read stages.  (Other possible candidates for such an optimization, such as the
CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.)
> AbstractExecutorService may be useful.  The implementations of ICommitLogExecutorService
may also be useful. (Despite the name these are not actual ExecutorServices, although they
share the most important properties of one.)
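
To make the batching idea in the description concrete, below is a minimal sketch (not the actual CASSANDRA-4718 patch) of an executor whose worker threads use BlockingQueue.drainTo to dequeue tasks in bulk; the class name BatchDrainingExecutor and the BATCH constant are illustrative assumptions, not names from the codebase.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch only: workers block for one task, then drain up to BATCH - 1 more
// in a single queue operation, amortising producer/consumer contention.
public class BatchDrainingExecutor
{
    private static final int BATCH = 64; // illustrative batch size

    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private volatile boolean shutdown = false;

    public BatchDrainingExecutor(int threads)
    {
        for (int i = 0; i < threads; i++)
        {
            Thread t = new Thread(this::workerLoop, "batch-worker-" + i);
            t.setDaemon(true);
            t.start();
        }
    }

    public void execute(Runnable task)
    {
        if (shutdown)
            throw new IllegalStateException("executor is shut down");
        queue.add(task);
    }

    public void shutdown()
    {
        shutdown = true;
    }

    private void workerLoop()
    {
        List<Runnable> batch = new ArrayList<>(BATCH);
        while (!shutdown || !queue.isEmpty())
        {
            try
            {
                // Wait briefly for the first task, then grab whatever else is queued.
                Runnable first = queue.poll(100, TimeUnit.MILLISECONDS);
                if (first == null)
                    continue;
                batch.add(first);
                queue.drainTo(batch, BATCH - 1);
                for (Runnable r : batch)
                {
                    try { r.run(); }
                    catch (Throwable t) { t.printStackTrace(); } // don't let one failed task kill the worker
                }
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
                return;
            }
            finally
            {
                batch.clear();
            }
        }
    }
}
{code}

The point of the batch is to pay the queue's locking and signalling cost once per drain rather than once per task; the sep branch discussed above goes further than this, but the drainTo loop is the shape of the optimisation the description asks for.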



