cassandra-commits mailing list archives

From "Ricardo Bartolome (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13663) Cassandra 3.10 crashes without dump
Date Tue, 19 Sep 2017 14:08:01 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171766#comment-16171766 ]

Ricardo Bartolome commented on CASSANDRA-13663:
-----------------------------------------------

Does anybody have news about this issue?

We are experiencing a similar issue, although in our case we don't see any oom-killer errors. Our
scenario is:
* 12 x i3.2xlarge instances (8 CPU, 64GB memory)
* Storage per node is ~400GB
* Cassandra 3.9 (we are considering an upgrade, but nothing in the CHANGELOG appears related
to this issue)
* Oracle JVM build 1.8.0_112-b15

We also have 1-2 dead nodes a week. We have enabled heap dumps on several nodes to help
identify the cause, but so far the crash has not reproduced on the nodes that have them enabled
(and we don't know whether a dump would contain useful information if the problem is off-heap).

Some off-heap memory usage statistics, gathered through JMX from the following beans:
* org.apache.cassandra.metrics:name=BloomFilterOffHeapMemoryUsed,type=Table
* org.apache.cassandra.metrics:name=AllMemtablesOffHeapSize,type=Table
* org.apache.cassandra.metrics:name=CompressionMetadataOffHeapMemoryUsed,type=Table

h4. BloomFilterOffHeapMemoryUsed (~ 1.6GB)
{code}
Value = 1619193928;
Value = 1546767024;
Value = 1669879216;
Value = 1576772336;
Value = 1567804792;
Value = 1605097824;
Value = 1608551904;
Value = 1502500424;
Value = 1363705192;
Value = 1259389280;
Value = 1671282736;
{code}

h4. AllMemtablesOffHeapSize (~700MB)
{code}
Value = 692597111;
Value = 617691154;
Value = 693412363;
Value = 708732630;
Value = 664297343;
Value = 705367430;
Value = 626936323;
Value = 652724309;
Value = 700223457;
Value = 666516571;
Value = 682268720;
{code}

h4. CompressionMetadataOffHeapMemoryUsed (~110MB)
{code}
Value = 111307336;
Value = 105718312;
Value = 110638576;
Value = 111370032;
Value = 105979296;
Value = 108963456;
Value = 111122216;
Value = 99788200;
Value = 95279232;
Value = 106130392;
Value = 113217400;
{code}
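
As a rough sanity check on the figures above, here is a minimal Python sketch that averages each set of samples and adds the averages together. It assumes each `Value` line is one independent sample of the same metric, in bytes (for example, one reading per node); the names `SAMPLES`, `mean_bytes`, and `summarize` are illustrative only and are not part of Cassandra or JMX.

```python
# Samples copied from the JMX readings listed above (assumed to be bytes).
SAMPLES = {
    "BloomFilterOffHeapMemoryUsed": [
        1619193928, 1546767024, 1669879216, 1576772336, 1567804792,
        1605097824, 1608551904, 1502500424, 1363705192, 1259389280,
        1671282736,
    ],
    "AllMemtablesOffHeapSize": [
        692597111, 617691154, 693412363, 708732630, 664297343,
        705367430, 626936323, 652724309, 700223457, 666516571,
        682268720,
    ],
    "CompressionMetadataOffHeapMemoryUsed": [
        111307336, 105718312, 110638576, 111370032, 105979296,
        108963456, 111122216, 99788200, 95279232, 106130392,
        113217400,
    ],
}


def mean_bytes(values):
    """Arithmetic mean of a non-empty list of byte counts."""
    return sum(values) / len(values)


def summarize(samples):
    """Return per-metric means and their combined total, all in bytes."""
    means = {name: mean_bytes(values) for name, values in samples.items()}
    return means, sum(means.values())


if __name__ == "__main__":
    means, total = summarize(SAMPLES)
    for name, value in means.items():
        print(f"{name}: {value / 1e6:.0f} MB")
    print(f"combined: {total / 1e9:.2f} GB")
```

With these samples, the three metrics together average a little over 2.3 GB, which is a small fraction of the 64GB on each instance. This sum only covers the three beans listed; any other native allocations are not included.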

Any idea what else I can look at?


> Cassandra 3.10 crashes without dump
> -----------------------------------
>
>                 Key: CASSANDRA-13663
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13663
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Matthias Otto
>            Priority: Minor
>         Attachments: 2017-07-04 10_48_34-CloudWatch Management Console.png, cassandra
debug.log, cassandra system.log, RamUsageExamle1.png, RamUsageExample2.png
>
>
> Hello. My company runs a 5 node Cassandra cluster. For the last few weeks, we have had
a sporadic issue where one of the servers crashes without creating a dump file and without
any error messages in the logs. If one restarts the service (which we have by now scripted
to happen automatically), the server resumes work with no complaint.
> Log files from the time of the last crash are attached, though again they do not log any
crash happening.
> Regarding our setup, we are running these servers on Amazon AWS, with 3 volumes per server,
one for the system, one for data and one for the commitlog. When a crash happens, we can observe
a sudden spike of read activity on the commitlog volume. All of these have ample free space.
Especially the system volume has more than enough free space so that a dump could be written.
> The servers are Ubuntu 16.04 servers and Cassandra is installed from the apt-get package
for version 3.10.
> It is worth noting that these crashes happen more often when nodetool is running either
a repair job or a backup job, but this is by no means always the case. As for frequency, we
have had about 1-2 crashes per week for the last month.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

