incubator-cassandra-dev mailing list archives

From Peter Schuller <peter.schul...@infidyne.com>
Subject mmap:ed i/o and buffer sizes
Date Fri, 10 Dec 2010 20:45:04 GMT
I was looking closer at sliced_buffer_size_in_kb and
column_index_size_in_kb and reached the conclusion that for the
purpose of I/O, these are irrelevant when using mmap:ed I/O mode
(which makes sense, since there is no way to use a "buffer size" when
all you're doing is touching memory). The only effect is that
column_index_size_in_kb still affects the size at which indexing
triggers, which is as advertised.
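
To illustrate the distinction, here is a minimal Python sketch (not
Cassandra's code; the function names read_buffered, read_mmapped and
demo are hypothetical). With read()-style I/O the caller picks a buffer
size for every call. With a mapped file the caller just slices memory,
and how much is fetched from disk is left to the kernel's paging.

```python
import mmap
import os
import tempfile


def read_buffered(path, offset, length, buffer_size):
    """Read `length` bytes at `offset` with explicit read() calls.

    Each call asks for at most `buffer_size` bytes, so the buffer size
    directly shapes the I/O requests the process issues.
    """
    chunks = []
    with open(path, "rb", buffering=0) as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            chunk = f.read(min(buffer_size, remaining))
            if not chunk:  # reached end of file
                break
            chunks.append(chunk)
            remaining -= len(chunk)
    return b"".join(chunks)


def read_mmapped(path, offset, length):
    """Read the same range by touching mapped memory.

    There is no buffer-size parameter: page faults decide what is
    fetched from disk, and when.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]


def demo():
    """Write a small temporary file and read one range both ways."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * 64)  # 16 KiB of sample data
        path = tmp.name
    try:
        return (read_buffered(path, 100, 5000, 1024),
                read_mmapped(path, 100, 5000))
    finally:
        os.remove(path)
```

Both functions return identical bytes; only the first has a knob
comparable to a buffer size setting.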

Firstly, can anyone confirm/deny my interpretation?

Secondly, has anyone tested the effect of mmap():ed I/O on the
efficiency (in terms of disk seeks) of reads on large data sets? The
CPU benefits of mmap() may be negated when disk-bound if the kernel's
read-ahead logic interacts sub-optimally with Cassandra's use-case.
Potentially, even reading more than a single page could imply multiple
seeks (assuming a loaded system with other I/O in the queue) if the
kernel does no read-ahead until it sees the first successive access.
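
One way to experiment with this would be to hint the kernel before
touching the pages. The following Python sketch is an assumption about
how such a test could look, not something Cassandra does. It uses
madvise() with MADV_WILLNEED, which is only available on some platforms
(and in Python 3.8+). The helper names page_aligned_range,
read_with_hint and demo_hint are hypothetical.

```python
import mmap
import os
import tempfile


def page_aligned_range(offset, length, page_size=mmap.PAGESIZE):
    """Expand [offset, offset + length) outward to whole pages.

    madvise() needs a page-aligned start address, so round the start
    down and the end up to page boundaries.
    """
    start = (offset // page_size) * page_size
    end = -(-(offset + length) // page_size) * page_size  # ceiling
    return start, end - start


def read_with_hint(path, offset, length):
    """Read a range from a mapped file after hinting the kernel.

    Assumes `offset` lies inside the file. The hint is advisory; the
    kernel is free to ignore it.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            start, span = page_aligned_range(offset, length)
            span = min(span, len(m) - start)  # stay inside the mapping
            # Skip the hint where the platform does not provide it.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
                m.madvise(mmap.MADV_WILLNEED, start, span)
            return m[offset:offset + length]


def demo_hint():
    """Write a small temporary file and read a range with the hint."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * 64)  # 16 KiB of sample data
        path = tmp.name
    try:
        return read_with_hint(path, 5000, 100)
    finally:
        os.remove(path)
```

Whether such a hint actually reduces seeks under load is exactly the
open question; it would need to be measured.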

I have not checked what actually does happen, nor have I benchmarked
for comparison. But I'd be interested in hearing if people have
already addressed this in the past.

-- 
/ Peter Schuller
