cassandra-commits mailing list archives

From "Robert Rudduck (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-7220) Nodes hang with 100% CPU load
Date Mon, 18 Aug 2014 04:58:18 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Rudduck updated CASSANDRA-7220:
--------------------------------------

    Attachment: system.log

That is possible; we are running 2.0.8 at the moment, IIRC. Attached is the system log of
a node that recently exhibited the same behavior, although it did not stay at 100% CPU
for very long after it stopped responding. It may not be directly related to this bug,
although the general symptoms seem similar. If I need to open a new issue, let me
know.

> Nodes hang with 100% CPU load
> -----------------------------
>
>                 Key: CASSANDRA-7220
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7220
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: C* 2.0.7
> 4 nodes cluster on 12 core machines
>            Reporter: Robert Stupp
>            Assignee: Ryan McGuire
>         Attachments: c-12-read-100perc-cpu.zip, system.log
>
>
> I ran a test that both reads and writes rows.
> After some time, all writes succeeded and all reads stopped.
> Two of the four nodes have 16 of 16 threads of the "ReadStage" thread pool running. The
> number of pending tasks continuously grows on these nodes.
> I have attached outputs of the stack traces and some diagnostic output from "nodetool
> tpstats".
> "nodetool status" shows all nodes as UN.
> I had run that test previously without any issues with the same configuration.
> Some "specials" from cassandra.yaml:
> - key_cache_size_in_mb: 1024
> - row_cache_size_in_mb: 8192
> The nodes running at 100% CPU are "node2" and "node3". node1 and node4 are fine.
> I'm not sure if it is reproducible, but it's definitely not good behaviour.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
