cassandra-user mailing list archives

From Peter Schuller <>
Subject Re: memory consuption
Date Fri, 18 Feb 2011 08:18:11 GMT
> Jonathan,
> When you get time could you please explain that a little more. Got a feeling
> I'm about to learn something :)

I'm not Jonathan, but: The operating system's virtual memory system
supports mapping files into a process's address space. This will "use"
virtual memory, i.e. address space. On 32 bit systems this was a
practical concern until recently, since a process could realistically
run out of address space; with 64 bit (even if you can't address the
full 64 bits) this will not be an issue for a long while, making
virtual address space essentially "free".
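
As a rough illustration of the point about address space, here is a
minimal sketch in Python. It assumes Linux (it reads /proc/self/status)
and the helper names are made up for this example. It maps a sparse
64 MB file and reports how much the virtual size and the resident size
of the process grew:

```python
import mmap
import tempfile

def read_status_kb(field):
    """Return a field such as VmSize or VmRSS from /proc/self/status in kB.

    Linux only; returns None where /proc is not available.
    """
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith(field + ":"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None

def map_sparse_file(size):
    """Map a sparse file of `size` bytes.

    Returns (mapped length, virtual growth in kB, resident growth in kB).
    The two growth values are None if /proc could not be read.
    """
    with tempfile.TemporaryFile() as f:
        f.truncate(size)  # sparse file: no data is actually written
        before = (read_status_kb("VmSize"), read_status_kb("VmRSS"))
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as m:
            after = (read_status_kb("VmSize"), read_status_kb("VmRSS"))
            mapped = len(m)
    if None in before or None in after:
        return mapped, None, None
    return mapped, after[0] - before[0], after[1] - before[1]

if __name__ == "__main__":
    print(map_sparse_file(64 * 1024 * 1024))
```

On a typical Linux box the virtual size should grow by roughly the size
of the mapping while the resident size barely moves, since no page has
been touched yet.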

What matters from the perspective of "memory use" in the sense it is
normally meant is the amount of data allocated via brk() or by
mmap():ing /dev/zero (i.e. anonymous memory), which represents real
memory used (or possibly swap space, but unless the memory is never
accessed again that's usually not interesting from the point of view
of "how much memory do I need").
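
For contrast, a small sketch of the anonymous-memory case, again in
Python and again assuming Linux for the /proc read (the function names
are hypothetical). An anonymous private mapping has no file behind it,
so every page that gets written to has to be backed by real memory or
swap:

```python
import mmap

def vm_rss_kb():
    """Return the resident set size in kB from /proc/self/status, or None."""
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None

def touch_anonymous(size):
    """Allocate `size` bytes of anonymous memory and write to every page.

    Returns (number of pages touched, resident growth in kB or None).
    """
    page = mmap.PAGESIZE
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    with mmap.mmap(-1, size, flags=flags) as m:
        before = vm_rss_kb()
        touched = 0
        for offset in range(0, size, page):
            m[offset] = 1  # first write to a page forces it to be allocated
            touched += 1
        after = vm_rss_kb()
    growth = None if before is None or after is None else after - before
    return touched, growth

if __name__ == "__main__":
    print(touch_anonymous(32 * 1024 * 1024))
```

Here the resident size is expected to grow by about the full 32 MB,
because none of these pages can simply be dropped and re-read from a
file.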

The key issue is that for a mmap():ed file, there is never a need to
retain the data in physical memory (=resident). Thus, whatever you do
keep resident in physical memory is essentially just there as a cache,
in the same way as normal I/O will cause the kernel page cache to
retain data that you read/write. The difference between normal I/O
and mmap() is that in the mmap() case the memory is actually mapped to
the process, thus affecting the virtual size as reported by top. The
main argument for using mmap() instead of standard I/O is the fact
that reading entails just touching memory - in the case of the memory
being resident, you just read it - you don't even take a page fault
(so no overhead in entering the kernel and doing a semi-context
switch).
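
To show the two access styles side by side, a minimal Python sketch
(Unix only because of os.pread(); the function names and the test file
contents are made up for illustration). Both functions return the same
bytes; the difference is that the first makes a system call per read,
while the second reads from memory once the file is mapped:

```python
import mmap
import os
import tempfile

def read_with_syscall(path, offset, length):
    """Standard I/O: pread() enters the kernel and copies data into a buffer."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)

def read_with_mmap(path, offset, length):
    """mmap(): once mapped, reading is a memory access (the slice copies it out)."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]

def demo():
    """Write a small file and read the same range both ways."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"0123456789" * 1000)
        path = f.name
    try:
        return read_with_syscall(path, 5, 10), read_with_mmap(path, 5, 10)
    finally:
        os.unlink(path)

if __name__ == "__main__":
    via_read, via_mmap = demo()
    print(via_read == via_mmap)
```

In a real application the mapping would be set up once and kept around,
not created per read as in this sketch; otherwise the cost of the
mmap() call itself defeats the purpose.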

A downside with mmap() is that you have less control over how I/O is
done (you can't say "read 60 MB from here", but instead traverse pages
hoping prefetch and/or read-ahead will help you; this can be mitigated
with posix_fadvise(), but then you're back to doing syscalls).
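
A sketch of that mitigation in Python, assuming a platform where
os.posix_fadvise() exists (it does on Linux, not everywhere). The
function name and the 1 MB test file are hypothetical. The hint asks
the kernel to start reading a range before the mapped pages are
touched:

```python
import mmap
import os
import tempfile

def map_with_readahead_hint(path, offset, length):
    """Hint that [offset, offset + length) will be needed, then map the file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # This is a system call, which is the cost mentioned above.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
        # mmap duplicates the descriptor, so fd can be closed afterwards.
        return mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)

def demo():
    """Create a 1 MB file, map it with a read-ahead hint, read a few bytes."""
    size = 1024 * 1024
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * size)
        path = f.name
    try:
        with map_with_readahead_hint(path, 0, size) as m:
            return len(m), m[:4]
    finally:
        os.unlink(path)

if __name__ == "__main__":
    print(demo())
```

The hint is advisory only; the kernel is free to ignore it, so this
does not give the same control as an explicit large read.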

The other effect of mmap() is that it seems to affect the kernel's
sense of the relative priority of pages when deciding what to drop or
swap out, such that mmap() has a tendency to cause the JVM heap to be
swapped out. But this is not because the process actually uses more
memory as such.

I haven't read it closely, but from scrolling through it, the
Wikipedia article seems to be a pretty good intro:

/ Peter Schuller
