I just used Linux "top" to check how much virtual memory the JVM is using. With mmap turned on, VIRT is about equal to the size of your live sstables. With mmap turned off, VIRT will be close to the -Xmx of your JVM.
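
If you want the same numbers without running top interactively, here is a rough Python sketch that reads them from /proc/<pid>/status on Linux. VmSize there corresponds to top's VIRT and VmRSS to RES, both in kB. The function names are mine, just for illustration, and you have to pass the JVM's pid yourself.

```python
import os


def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Lines look like "VmSize:   123456 kB"; keep the number only.
            fields[key] = int(rest.split()[0])
    return fields


def vm_fields_for_pid(pid):
    """Read VmSize/VmRSS for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_fields(f.read())


if __name__ == "__main__":
    # Report on this Python process as a stand-in for the JVM's pid.
    if os.path.exists("/proc/self/status"):
        print(vm_fields_for_pid(os.getpid()))
```

With mmap on, you would expect VmSize to be far larger than VmRSS, since only the pages actually touched are resident.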

Anyway, with mmap, before you can access data through the mapped buffer (a virtual address), the OS has to page the data into a block of physical memory and map your virtual address to that block. So if you use the random partitioner, reads are scattered across the data files, and when the data is larger than RAM you'll most likely force Linux to page in and out all the time. In this case, disabling mmap and letting Cassandra use random file access seems to make more sense. mmap should be used when you have enough RAM for the OS to cache most or all of your data files.
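
To make the two access modes concrete, here is a small Python sketch (not Cassandra's code) that reads the same random records once through a memory map and once with plain seek/read. The file name, record size, and function names are all made up for illustration. It only shows that both paths return the same bytes; it does not measure paging behaviour.

```python
import mmap
import os
import random
import tempfile

RECORD_SIZE = 64  # hypothetical fixed record size, for illustration only


def write_sample_file(path, num_records):
    """Write fixed-size records; record i is filled with the byte i % 256."""
    with open(path, "wb") as f:
        for i in range(num_records):
            f.write(bytes([i % 256]) * RECORD_SIZE)


def read_with_mmap(path, indexes):
    """Map the whole file into the address space; the OS pages data in on access."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return [m[i * RECORD_SIZE:(i + 1) * RECORD_SIZE] for i in indexes]


def read_with_seek(path, indexes):
    """Plain random file access: seek to each record and read it into a buffer."""
    records = []
    with open(path, "rb") as f:
        for i in indexes:
            f.seek(i * RECORD_SIZE)
            records.append(f.read(RECORD_SIZE))
    return records


def demo(num_records=1000, num_reads=100, seed=0):
    """Return True if both access modes yield identical records."""
    rng = random.Random(seed)
    indexes = [rng.randrange(num_records) for _ in range(num_reads)]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sample.db")
        write_sample_file(path, num_records)
        return read_with_mmap(path, indexes) == read_with_seek(path, indexes)
```

In the mmap version the whole file counts toward the process's virtual size as soon as it is mapped, while the seek/read version only ever holds one record-sized buffer at a time.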


On Thu, May 6, 2010 at 10:49 AM, Vick Khera <vivek@khera.org> wrote:
On Thu, May 6, 2010 at 1:06 PM, Weijun Li <weijunli@gmail.com> wrote:
> In this case using mmap will cause Cassandra to use sometimes > 100G virtual
> memory which is much more than the physical ram, since we are using random
> partitioner the OS will be busy doing swap.

mmap uses the virtual address space to reference bits on the disk; it
does *NOT* use physical or virtual memory to copy that data other than
perhaps any disk buffer cache from reading the file (which you would
have anyhow).  Your memory usage tools will report high memory usage
because they tell you how much virtual address space you have