cassandra-commits mailing list archives

From "Aleksey Yeschenko (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-10249) Make buffered read size configurable
Date Tue, 17 Nov 2015 14:52:11 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008756#comment-15008756 ]

Aleksey Yeschenko edited comment on CASSANDRA-10249 at 11/17/15 2:52 PM:
-------------------------------------------------------------------------

I don't know where you got the 4K number for 2.2. So a patch for 2.2 is actually required.
Anyway, pushed a 2.1 and a 2.2 branch (with some minor changes to the former to make it idiomatic),
now awaiting CI results.

||branch||testall||dtest||
|[10249-2.1|https://github.com/iamaleksey/cassandra/tree/10249-2.1]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-10249-2.1-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-10249-2.1-dtest]|
|[10249-2.2|https://github.com/iamaleksey/cassandra/tree/10249-2.2]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-10249-2.2-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-10249-2.2-dtest]|


was (Author: iamaleksey):
I'm don't know where you got the 4K number for 2.2. So a patch for 2.2 is actually required.
Anyway, pushed a 2.1 and a 2.2 branch (with some minor changes to the former to make it idiomatic),
now awaiting CI results.

||branch||testall||dtest||
|[10249-2.1|https://github.com/iamaleksey/cassandra/tree/10249-2.1]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-10249-2.1-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-10249-2.1-dtest]|
|[10249-2.2|https://github.com/iamaleksey/cassandra/tree/10249-2.2]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-10249-2.2-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-10249-2.2-dtest]|

> Make buffered read size configurable
> ------------------------------------
>
>                 Key: CASSANDRA-10249
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Albert P Tobey
>            Assignee: Albert P Tobey
>             Fix For: 2.1.x, 2.2.x
>
>         Attachments: Screenshot 2015-09-11 09.32.04.png, Screenshot 2015-09-11 09.34.10.png,
patched-2.1.9-dstat-lvn10.png, stock-2.1.9-dstat-lvn10.png, yourkit-screenshot.png
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits over the network.
This causes problems throughout the system by wasting disk IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance. The only requirement
to reproduce the issue is enough data to blow through the page cache. The default schema and
data size with cassandra-stress is sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed anywhere from a 300:1 to a 500:1 disk:network ratio.
That is to say, for 1MB/s of network IO, Cassandra was doing 300-500MB/s of disk reads, saturating
the drive.
> After applying this patch for standard IO mode https://gist.github.com/tobert/10c307cf3709a585a7cf
the ratio fell to around 100:1 on my local test rig. Latency improved considerably and GC
became a lot less frequent.
> I tested with 512-byte reads as well, but got the same performance, which makes sense
since all HDDs and SSDs made in the last few years have a 4K block size (many of them lie and
say 512).
> I'm re-running the numbers now and will post them tomorrow.
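
The disk:network ratios quoted in the description can be sanity-checked with a little arithmetic. The following is a minimal sketch in Python, not code from the patch: the function names are hypothetical, and the model (every read is served by filling whole buffers aligned to the buffer size) is a simplifying assumption made only to illustrate why a smaller read buffer can lower the ratio.

```python
MB = 1024 * 1024


def disk_to_network_ratio(disk_bytes_per_s, network_bytes_per_s):
    """Bytes read from disk per byte sent over the network."""
    if network_bytes_per_s <= 0:
        raise ValueError("network throughput must be positive")
    return disk_bytes_per_s / network_bytes_per_s


def buffered_read_bytes(offset, length, buffer_size):
    """Bytes fetched from disk for one read of `length` bytes at `offset`.

    Assumption (illustrative model only): the reader always fills whole
    buffers aligned to `buffer_size`, so a small read still costs at
    least one full buffer.
    """
    if length <= 0 or buffer_size <= 0:
        raise ValueError("length and buffer_size must be positive")
    first = (offset // buffer_size) * buffer_size            # round start down
    last = -(-(offset + length) // buffer_size) * buffer_size  # round end up
    return last - first


# 1MB/s on the network against 300MB/s of disk reads is a 300:1 ratio.
ratio = disk_to_network_ratio(300 * MB, 1 * MB)

# Hypothetical 200-byte read at offset 100 under two buffer sizes.
small_buffer_cost = buffered_read_bytes(100, 200, 4096)
large_buffer_cost = buffered_read_bytes(100, 200, 65536)
```

Under this model the cost of a small random read is dominated by the buffer size rather than by the payload, which is consistent with the drop in the ratio reported after the patch. The 200-byte payload and the two buffer sizes are illustrative values, not measurements from the ticket.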



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
