incubator-cassandra-user mailing list archives

From Peter Schuller <peter.schul...@infidyne.com>
Subject Re: Severe Reliability Problems - 0.7 RC2
Date Mon, 20 Dec 2010 19:49:22 GMT
> be correlated is the flushing of memtables tables. One of the strangest
> stats I am getting when in this state is memory paging: 3727168.00 pages
> scanned/second (see sar -B output). Occasionally, if I leave the process
> alone (~1 h) it recovers (maybe 1 in 5 times), otherwise the only way to

Sounds to me like the Cassandra process is triggering something along
the lines of fast-path page cache eviction. The fact that you see
Cassandra at 100% system (as opposed to user) CPU, together with a
huge number of pages scanned, certainly sounds like you're hitting an
edge case or bug in the kernel's virtual memory system. The JVM can't
really do much about it if it's stuck in a syscall that never
returns...
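
For reference, one way to confirm the system-versus-user split for a single process is to read the utime and stime counters from /proc/<pid>/stat on Linux. The following is only a minimal sketch under that assumption; the function names are made up for illustration and are not part of Cassandra or any tool mentioned in this thread.

```python
def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces, so split on the last ')' before tokenizing the rest.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]), int(rest[12])


def system_share(utime, stime):
    """Fraction of the process's CPU time that was spent in the kernel."""
    total = utime + stime
    return stime / total if total else 0.0


def read_cpu_ticks(pid):
    """Read the counters for a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_cpu_ticks(f.read())
```

Sampling read_cpu_ticks twice a few seconds apart and comparing the deltas shows where the time is going right now, rather than over the whole lifetime of the process.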

There were a couple of threads on lkml recently that may be relevant,
but I have to run so I can't find the URLs right now (will follow up
later tonight).

Is anyone aware of a way to get a kernel stack trace for a given
process on a running system?
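
One possible approach, offered only as a sketch: Linux kernels from 2.6.29 onward, when built with CONFIG_STACKTRACE, expose a per-thread kernel stack at /proc/<pid>/task/<tid>/stack, normally readable only by root. Whether the kernel in question has this enabled is an assumption, and the helper names and the proc_root parameter below are invented for illustration.

```python
import os


def kernel_stack_paths(pid, proc_root="/proc"):
    """List the per-thread kernel stack files for a process."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    tids = sorted(os.listdir(task_dir), key=int)
    return [os.path.join(task_dir, tid, "stack") for tid in tids]


def read_kernel_stacks(pid, proc_root="/proc"):
    """Map each thread id to its kernel stack text, or None if unreadable."""
    stacks = {}
    for path in kernel_stack_paths(pid, proc_root):
        tid = int(os.path.basename(os.path.dirname(path)))
        try:
            with open(path) as f:
                stacks[tid] = f.read()
        except OSError:
            # The thread may have exited, or we lack the privileges.
            stacks[tid] = None
    return stacks
```

Run as root against the Cassandra pid, this would show which kernel functions each thread is sitting in, which should help tell a reclaim loop apart from an ordinary blocking syscall.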

Cargo cult solution: Upgrade the kernel :)

-- 
/ Peter Schuller
