cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-7404) Use direct i/o for sequential operations (compaction/streaming)
Date Tue, 30 Jun 2015 16:03:05 GMT


Ariel Weisberg commented on CASSANDRA-7404:

I don't remember how the buffer sizing worked, but it's not as simple as just auto-sizing
using an existing policy. With direct IO you have to manage your own read ahead for spinning
disk because the kernel isn't going to do it for you in the page cache. Then you have to not
run out of memory in the worst case scenario where too many tables are being merged together.

That is how we ended up with the hybrid where a manageable number of files are opened with
a big buffer to allow large reads, and if too many files are open we stop doing direct IO
and let the kernel manage read ahead and memory via the page cache.
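
A minimal sketch of that hybrid decision, in Java. This is not the code from the ticket: the class name HybridReadPolicy, the plan method, and the idea of a single fixed memory budget are all assumptions made for illustration. It only shows the shape of the rule: use direct IO with a large per-file buffer while the worst-case total stays within budget, otherwise fall back to buffered reads and let the page cache handle read ahead.

```java
/**
 * Hypothetical sketch (names and parameters are illustrative, not Cassandra APIs):
 * decide whether a merge of N input files may use direct IO with large buffers.
 */
public final class HybridReadPolicy {

    /** Result of the decision for one merge. */
    public static final class Plan {
        public final boolean directIo;         // true: bypass the page cache
        public final long bufferBytesPerFile;  // 0 when the kernel manages read ahead

        Plan(boolean directIo, long bufferBytesPerFile) {
            this.directIo = directIo;
            this.bufferBytesPerFile = bufferBytesPerFile;
        }
    }

    private HybridReadPolicy() {}

    /**
     * @param openFiles         number of files being merged together
     * @param largeBufferBytes  size of the big read-ahead buffer wanted per file
     * @param memoryBudgetBytes assumed upper bound on memory for these buffers
     */
    public static Plan plan(int openFiles, long largeBufferBytes, long memoryBudgetBytes) {
        if (openFiles <= 0 || largeBufferBytes <= 0 || memoryBudgetBytes <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // Worst case: every input file holds its own large buffer at the same time.
        long worstCaseBytes = Math.multiplyExact((long) openFiles, largeBufferBytes);
        if (worstCaseBytes <= memoryBudgetBytes) {
            // Manageable number of files: direct IO with our own large reads.
            return new Plan(true, largeBufferBytes);
        }
        // Too many files: stop doing direct IO and let the kernel manage
        // read ahead and memory via the page cache.
        return new Plan(false, 0L);
    }
}
```

With an assumed 64 MiB budget and 8 MiB buffers, a merge of 4 files would get direct IO, while a merge of 32 files would fall back to buffered reads.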

tl;dr buffer pooling with direct IO needs to be done carefully to bound maximum memory usage
and seeks on spinning disks. I haven't looked at how off heap buffers are managed in the issues
you reference so I don't know what is/isn't already solved.

> Use direct i/o for sequential operations (compaction/streaming)
> ---------------------------------------------------------------
>                 Key: CASSANDRA-7404
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jason Brown
>            Assignee: Ariel Weisberg
>              Labels: performance
>             Fix For: 3.x
> Investigate using Linux's direct i/o for operations where we read sequentially through
a file (repair and bootstrap streaming, compaction reads, and so on). Direct i/o does not
go through the kernel page cache, so it should leave the hot cache pages used for live reads undisturbed.
> Note: by using direct i/o, we will probably take a performance hit on reading the file
we're sequentially scanning through (that is, compactions may get slower), but the goal of
this ticket is to limit the impact of these background tasks on the main read/write functionality.
Of course, I'll measure any perf hit that is incurred, and see if there are any mechanisms to
mitigate it.

