cassandra-commits mailing list archives

From "Alexander Piavlo (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-9720) half open tcp connections to cassandra cluster nodes cause 100% cpu load
Date Thu, 02 Jul 2015 20:31:05 GMT
Alexander Piavlo created CASSANDRA-9720:
-------------------------------------------

             Summary: half open tcp connections to cassandra cluster nodes cause 100% cpu load
                 Key: CASSANDRA-9720
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9720
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Alexander Piavlo


cassandra 2.1.5

We noticed that a few nodes in our cluster suddenly spiked to 100% CPU and never recovered.
The cause is not GC, nor an increase in reads/writes on those nodes.
What we saw is that the nodes with 100% CPU load
all have some connections (file descriptors) that lsof reports as "can't identify protocol",
which suggests that the cassandra process is not properly handling abruptly closed connections.
http://stackoverflow.com/questions/7911840/seeing-too-many-lsof-cant-identify-protocol
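
A minimal sketch of how the "can't identify protocol" descriptors could be counted for one process, assuming lsof is installed and that the PID of the cassandra process is known. The function names here are hypothetical helpers for illustration, not part of Cassandra or of any tool mentioned in this report.

```python
import subprocess

# Text that lsof prints in the NAME column for sockets it cannot classify.
MARKER = "can't identify protocol"


def count_unidentified_sockets(lsof_output: str) -> int:
    """Count lines of lsof output that contain the marker text."""
    return sum(1 for line in lsof_output.splitlines() if MARKER in line)


def lsof_for_pid(pid: int) -> str:
    """Return the raw text of `lsof -p <pid>` (assumes lsof is on PATH)."""
    result = subprocess.run(
        ["lsof", "-p", str(pid)],
        capture_output=True,
        text=True,
        check=False,  # lsof may exit non-zero; we only want its stdout
    )
    return result.stdout
```

For example, count_unidentified_sockets(lsof_for_pid(pid)) could be sampled periodically on an affected node and on a healthy one to compare how many such descriptors each holds.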

We are pretty sure this was triggered by the Spark Cassandra connector,
which suddenly started to get stuck in early discovery of Cassandra nodes, before running any
stages.

PS. We had similar issues some time ago with an earlier version of the 2.1.x Cassandra branch, and
ended up solving the problem by upgrading from Spark 1.2.1 to Spark 1.3.1 and upgrading the
DataStax Spark connector accordingly. Now it looks like the problem is back, with 99.9% the same symptoms.

PS2. We have previously observed several processes unrelated to Java/Cassandra (mainly php-cli)
go crazy with CPU when they had the "can't identify protocol" symptoms.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
