incubator-cassandra-user mailing list archives

From Bryan Talbot <btal...@aeriagames.com>
Subject Re: constant CMS GC using CPU time
Date Fri, 19 Oct 2012 17:59:02 GMT
ok, let me try asking the question a different way ...

How does cassandra use memory and how can I plan how much is needed?  I
have a 1 GB memtable and 5 GB total heap, and that's still not enough even
though the number of concurrent connections and the garbage generation rate
are fairly low.

If I were using mysql or oracle, I could compute how much memory could be
used by N concurrent connections, how much is allocated for caching, temp
spaces, etc.  How can I do this for cassandra?  Currently it seems like
memory use scales with the number of bytes stored rather than with how busy
the server actually is.  That's not such a good thing.
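For what it's worth, here's the kind of estimate I'm after.  One heap
consumer that clearly scales with data rather than load is the bloom
filters, which are kept on-heap in 1.1.  A rough sketch using the standard
bloom filter sizing formula (the row count and fp chance below are made up
for illustration, not measured from our cluster):

```python
import math

def bloom_filter_heap_bytes(row_count, fp_chance):
    # Standard bloom filter sizing: bits per key = -ln(p) / (ln 2)^2.
    # This is generic bloom filter math, not a Cassandra-specific formula.
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    return int(row_count * bits_per_key / 8)

# Hypothetical: 1 billion rows at a 1% false-positive chance
est = bloom_filter_heap_bytes(1_000_000_000, 0.01)
print(f"~{est / 1024**3:.1f} GB of heap just for bloom filters")
```

If per-row structures like this dominate, that would explain heap use
growing with data volume even when the request rate stays flat.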

-Bryan



On Thu, Oct 18, 2012 at 11:06 AM, Bryan Talbot <btalbot@aeriagames.com> wrote:

> In a 4 node cluster running Cassandra 1.1.5 with sun jvm 1.6.0_29-b11
> (64-bit), the nodes often get "stuck" in a state where CMS
> collections of the old space are constantly running.
>
> The JVM configuration is using the standard settings in cassandra-env --
> relevant settings are included below.  The max heap is currently set to 5
> GB with 800MB for new size.  I don't believe that the cluster is overly
> busy and seems to be performing well enough other than this issue.  When
> nodes get into this state they never seem to leave it (by freeing up old
> space memory) without restarting cassandra.  They typically enter this
> state while running "nodetool repair -pr" but once they start doing this,
> restarting them only "fixes" it for a couple of hours.
>
> Compactions are completing and are generally not queued up.  All CF are
> using STCS.  The busiest CF consumes about 100GB of space on disk, is write
> heavy, and all columns have a TTL of 3 days.  Overall, there are 41 CF
> including those used for system keyspace and secondary indexes.  The number
> of SSTables per node currently varies from 185-212.
>
> Other than frequent log warnings about "GCInspector  - Heap is 0.xxx
> full..." and "StorageService  - Flushing CFS(...) to relieve memory
> pressure" there are no other log entries to indicate there is a problem.
>
> Does the memory needed vary depending on the amount of data stored?  If
> so, how can I predict how much jvm space is needed?  I don't want to make
> the heap too large as that's bad too.  Maybe there's a memory leak related
> to compaction that doesn't allow meta-data to be purged?
>
>
> -Bryan
>
>
> 12 GB of RAM in host with ~6 GB used by java and ~6 GB for OS and buffer
> cache.
> $> free -m
>              total       used       free     shared    buffers     cached
> Mem:         12001      11870        131          0          4       5778
> -/+ buffers/cache:       6087       5914
> Swap:            0          0          0
>
>
> jvm settings in cassandra-env
> MAX_HEAP_SIZE="5G"
> HEAP_NEWSIZE="800M"
>
> # GC tuning options
> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
> JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops"
>
>
> jstat shows about 12 full collections per minute with old heap usage
> constantly over 75%, so the old generation is always above the
> CMSInitiatingOccupancyFraction threshold.
>
> $> jstat -gcutil -t 22917 5000 4
> Timestamp         S0     S1     E      O      P     YGC     YGCT     FGC      FGCT       GCT
>       132063.0  34.70   0.00  26.03  82.29  59.88  21580  506.887   17523  3078.941  3585.829
>       132068.0  34.70   0.00  50.02  81.23  59.88  21580  506.887   17524  3079.220  3586.107
>       132073.1   0.00  24.92  46.87  81.41  59.88  21581  506.932   17525  3079.583  3586.515
>       132078.1   0.00  24.92  64.71  81.40  59.88  21581  506.932   17527  3079.853  3586.785
>
>
> Other hosts not currently experiencing the high CPU load have an old
> generation less than 75% full.
>
> $> jstat -gcutil -t 6063 5000 4
> Timestamp         S0     S1     E      O      P     YGC      YGCT     FGC      FGCT       GCT
>       520731.6   0.00  12.70  36.37  71.33  59.26  46453  1688.809   14785  2130.779  3819.588
>       520736.5   0.00  12.70  53.25  71.33  59.26  46453  1688.809   14785  2130.779  3819.588
>       520741.5   0.00  12.70  68.92  71.33  59.26  46453  1688.809   14785  2130.779  3819.588
>       520746.5   0.00  12.70  83.11  71.33  59.26  46453  1688.809   14785  2130.779  3819.588
>

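Doing the arithmetic on the first jstat sample from the loaded node above
(the threshold line assumes old gen is roughly max heap minus new size):

```python
# Deltas between the first and last rows of the ~15 s jstat window
fgc_delta = 17527 - 17523          # full GC events
fgct_delta = 3079.853 - 3078.941   # seconds spent in full GC
window = 132078.1 - 132063.0       # seconds elapsed

print(f"full GCs/min: {fgc_delta / window * 60:.1f}")
print(f"fraction of wall time in full GC: {fgct_delta / window:.1%}")

# CMS trigger point for this heap configuration
old_gen_mb = 5 * 1024 - 800        # MAX_HEAP_SIZE - HEAP_NEWSIZE
threshold_mb = old_gen_mb * 0.75   # CMSInitiatingOccupancyFraction=75
print(f"CMS starts when old gen exceeds ~{threshold_mb:.0f} MB of {old_gen_mb} MB")
```

That window works out to roughly 16 full GCs per minute with about 6% of
wall time spent in full GC, and a CMS trigger around 3240 MB, a level the
O column (81-82%) never drops below.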