cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6476) Assertion error in MessagingService.addCallback
Date Thu, 12 Dec 2013 16:19:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846405#comment-13846405
] 

Sylvain Lebresne commented on CASSANDRA-6476:
---------------------------------------------

bq. Is it possible a bug in native compression code is corrupting random crap elsewhere in
the JVM?

I have no clue; that would be a pretty serious JVM bug imo if that were the case. It would
also be uncanny for "random corrupted crap" to trigger the same assertion on different nodes
(but well, anything is possible). All I can say on that matter is that the native protocol
uses the same compression libs as sstable compression, and in basically the same way.

> Assertion error in MessagingService.addCallback
> -----------------------------------------------
>
>                 Key: CASSANDRA-6476
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6476
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 2.0.2 DCE
>            Reporter: Theo Hultberg
>            Assignee: Sylvain Lebresne
>
> Two of the three Cassandra nodes in one of our clusters just started behaving very strangely
about an hour ago. Within a minute of each other they started logging AssertionErrors (see
stack traces here: https://gist.github.com/iconara/7917438) over and over again. The client
lost connection with the nodes at roughly the same time. The nodes were still up, and even
if no clients were connected to them they continued logging the same errors over and over.
> The errors are in the native transport (specifically MessagingService.addCallback) which
makes me suspect that it has something to do with a test that we started running this afternoon.
I've just implemented support for frame compression in my CQL driver cql-rb. About two hours
before this happened I deployed a version of the application which enabled Snappy compression
on all frames larger than 64 bytes. It's not impossible that there is a bug somewhere in the
driver or compression library that caused this -- but at the same time, it feels like it shouldn't
be possible to make C* a zombie with a bad frame.
> Restarting seems to have got them back running again, but I suspect they will go down
again sooner or later.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
