cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered
Date Tue, 30 Jun 2015 00:17:04 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606713#comment-14606713 ]

Benedict commented on CASSANDRA-9681:
-------------------------------------

Regrettably, that heap dump also looks 100% healthy: approximately 500MB of memtable space
is in use.

It is possible we're getting some funky reporting, somehow, but I can't see an obvious candidate
change. Could you possibly try obtaining another heap dump when under more significant pressure?

It would also be great to get the log files from the 26th onwards, since that looks to be
where the utilisation spiked the most.

> Memtable heap size grows and many long GC pauses are triggered
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-9681
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: C* 2.1.7, Debian Wheezy
>            Reporter: mlowicki
>            Assignee: Benedict
>            Priority: Critical
>             Fix For: 2.1.x
>
>         Attachments: cassandra.yaml, system.log.6.zip, system.log.7.zip, system.log.8.zip, system.log.9.zip
>
>
> C* 2.1.7 cluster is behaving really badly after 1-2 days. {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
> jumps to 7 GB (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
> on 3/6 nodes in each data center and then there are many long GC pauses. Cluster is using
> default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).
> Before C* 2.1.5, memtable heap size was basically constant at ~500MB (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).
> After restarting all nodes, it behaves stably for 1-2 days. Today I've done that and long
> GC pauses are gone (~18:00 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
> The only pattern we've found so far is that long GC pauses happen at basically the
> same time on all nodes in the same data center, even on the ones where memtable heap size
> is not growing.
> Cliffs on the graphs are node restarts.
> Used memory on boxes where {{AllMemtablesHeapSize}} grows stays at the same level:
> https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0
> Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
