cassandra-user mailing list archives

From Matthew Dennis <>
Subject Re: Dazed and confused with Cassandra on EC2 ...
Date Fri, 08 Oct 2010 01:05:55 GMT
Also, in general, you probably want to set Xms = Xmx (regardless of the
value you eventually decide on for that).

If you set them equal, the JVM will simply allocate that amount up front
at startup.  If they're different, then whenever usage grows above Xms the
JVM has to allocate more memory and move a bunch of stuff around, and it
may have to do this multiple times.  Note that it does this at the worst
possible time (i.e. under heavy load, which is likely what caused you to
grow past Xms in the first place).
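For example (the 4G figure below is purely illustrative, not a recommendation
for your workload), pinning the heap means passing equal -Xms and -Xmx values
to the JVM:

```shell
# Hypothetical example: pin the JVM heap at 4 GB so the whole heap is
# allocated once at startup and never has to grow (and resize) under load.
JVM_OPTS="$JVM_OPTS -Xms4G -Xmx4G"
```

Depending on your Cassandra version, these options typically end up in the
JVM_OPTS variable of the startup scripts shipped with the distribution.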

On Thu, Oct 7, 2010 at 2:49 PM, Peter Schuller wrote:

> >  There are some words on the 'Net - the recent pages on
> >  Riptano's site, in fact - that strongly encourage scaling left
> >  and right, rather than beefing up the boxes - and certainly
> >  we're seeing far less bother from GC using a much smaller
> >  heap - previously we'd been going up to 16GB, or even
> >  higher.  This is based on my previous positive experiences
> >  of getting better performance from memory hog apps (eg.
> >  Java) by giving them more memory.  In any case, it seems
> >  that using large amounts of memory on EC2 is just asking
> >  for trouble.
> Keep in mind that while GC tends to be more efficient with larger heap
> sizes, that does not always translate into better overall performance
> when other things have to be considered. In particular, in the case of
> Cassandra, if you "waste" 10-15 gigs of RAM on the JVM heap for a
> Cassandra instance which could live with e.g. 1 GB, you're actively
> taking away those 10-15 gigs of RAM from the operating system to use
> for the buffer cache. Particularly if you're I/O bound on reads then,
> this could have very detrimental effects (assuming the data set is
> sufficiently small and locality is such that 15 GB of extra buffer
> cache makes a difference; usually, but not always, this is the case).
> So with Cassandra, in the general case, you definitely want to keep
> your heap size reasonable in relation to the actual live set (amount
> of actually reachable data), rather than just cranking it up as much
> as possible.
> (The main issue here is also keeping it high enough to not OOM, given
> that exact memory demands are hard to predict; it would be absolutely
> great if the JVM was better at maintaining a reasonable heap size to
> live set size ratio so that much less tweaking of heap sizes was
> necessary, but this is not the case.)
> --
> / Peter Schuller
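The tradeoff Peter describes can be sketched with made-up numbers: on a node
with a fixed amount of RAM, every gigabyte committed to the JVM heap is a
gigabyte the kernel cannot use for the buffer cache (all figures below are
hypothetical, for illustration only):

```shell
# Hypothetical 16 GB node.
TOTAL_MB=16384        # physical RAM
OS_MB=1024            # rough allowance for the OS and other processes

HEAP_MB=12288         # an oversized 12 GB heap...
CACHE_MB=$((TOTAL_MB - OS_MB - HEAP_MB))
echo "buffer cache with a 12 GB heap: ${CACHE_MB} MB"   # 3072 MB

HEAP_MB=2048          # ...versus a heap sized closer to the live set
CACHE_MB=$((TOTAL_MB - OS_MB - HEAP_MB))
echo "buffer cache with a 2 GB heap:  ${CACHE_MB} MB"   # 13312 MB
```

With the smaller heap, roughly 10 GB more of the same hardware goes to
caching data files, which is exactly where an I/O-bound read workload wants it.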
