cassandra-commits mailing list archives

From "mlowicki (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9681) Memtable heap size grows and many long GC pauses are triggered
Date Wed, 01 Jul 2015 07:34:04 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609685#comment-14609685 ]

mlowicki commented on CASSANDRA-9681:
-------------------------------------

So far so good - https://www.dropbox.com/s/ad8te1g6iz2wofe/Screenshot%202015-07-01%2009.31.00.png?dl=0
I'll let you know whether it degrades. The GC pauses we talked about yesterday are probably
caused by a misbehaving Logstash or Kibana: I've checked with jstat and gc.log, and everything
is fine on these boxes.
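
For anyone who wants to repeat the gc.log part of that check, here is a minimal Python sketch. It assumes the JVM runs with -XX:+PrintGCApplicationStoppedTime, so that gc.log contains "Total time for which application threads were stopped" lines. The function name long_pauses and the 1-second threshold are illustrative choices, not something taken from this ticket.

```python
import re

# Matches the HotSpot line written when -XX:+PrintGCApplicationStoppedTime is enabled.
# Assumption: the log uses this wording; other JVMs or logging setups may differ.
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def long_pauses(lines, threshold_s=1.0):
    """Return the pause durations (in seconds) that are at least threshold_s long."""
    pauses = []
    for line in lines:
        match = STOPPED_RE.search(line)
        if match is None:
            continue  # not a stopped-time line
        seconds = float(match.group(1))
        if seconds >= threshold_s:
            pauses.append(seconds)
    return pauses
```

Passing an open file works because the function only iterates over lines, e.g. `long_pauses(open("gc.log"))` (the path here is hypothetical). An empty result for the period in question would be consistent with the pauses coming from somewhere other than the Cassandra JVM.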

> Memtable heap size grows and many long GC pauses are triggered
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-9681
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9681
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: C* 2.1.7, Debian Wheezy
>            Reporter: mlowicki
>            Assignee: Benedict
>            Priority: Critical
>             Fix For: 2.1.x
>
>         Attachments: cassandra.yaml, db5.system.log, db5.system.log.1.zip, db5.system.log.2.zip,
db5.system.log.3.zip, schema.cql, system.log.6.zip, system.log.7.zip, system.log.8.zip, system.log.9.zip
>
>
> The C* 2.1.7 cluster behaves really badly after 1-2 days. {{gauges.cassandra.jmx.org.apache.cassandra.metrics.ColumnFamily.AllMemtablesHeapSize.Value}}
jumps to 7 GB (https://www.dropbox.com/s/vraggy292erkzd2/Screenshot%202015-06-29%2019.12.53.png?dl=0)
on 3/6 nodes in each data center and then there are many long GC pauses. The cluster is using
the default heap size values ({{-Xms8192M -Xmx8192M -Xmn2048M}}).
> Before C* 2.1.5 the memtable heap size was basically constant at ~500 MB (https://www.dropbox.com/s/fjdywik5lojstvn/Screenshot%202015-06-29%2019.30.00.png?dl=0).
> After restarting all nodes it behaves stably for 1-2 days. Today I've done that and the long
GC pauses are gone (~18:00 https://www.dropbox.com/s/7vo3ynz505rsfq3/Screenshot%202015-06-29%2019.28.37.png?dl=0).
The only pattern we've found so far is that the long GC pauses happen at basically the
same time on all nodes in the same data center - even on the ones where the memtable heap size
is not growing.
> The cliffs on the graphs are node restarts.
> Used memory on the boxes where {{AllMemtablesHeapSize}} grows stays at the same level -
https://www.dropbox.com/s/tes9abykixs86rf/Screenshot%202015-06-29%2019.37.52.png?dl=0
> Replication factor is set to 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
