cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-9793) Log when messages are dropped due to cross_node_timeout
Date Mon, 13 Jul 2015 22:59:05 GMT
Brandon Williams created CASSANDRA-9793:
-------------------------------------------

             Summary: Log when messages are dropped due to cross_node_timeout
                 Key: CASSANDRA-9793
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9793
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Brandon Williams
             Fix For: 2.1.x, 2.0.x


When a node has clock skew and cross-node timeouts are enabled (cross_node_timeout), there's no
indication that the messages were dropped due to the cross-node timeout, just that messages were
dropped.  This can mistakenly lead you down a path of troubleshooting a load-shedding situation
when really you just have clock drift on one node.  This is also not simple to troubleshoot, since
you have to determine that this node will answer requests, but other nodes won't answer requests
from it.  If the problem goes away on a reboot (and the machine does a one-shot time sync, not a
continuous one), it becomes even harder to detect because you're left with a weird piece of evidence
such as "it's fine after a reboot, but comes back in about X days every time."

It would help tremendously if there were a log message indicating how many messages (don't
need them broken down by type) were eagerly dropped due to the cross node timeout.
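
As a rough illustration of the request, the sketch below counts messages that are
dropped on arrival because their sender timestamp makes them look older than the
timeout, and produces one summary line for the log. It is not Cassandra's actual
code: the class CrossNodeDropTracker, its methods, and the wording of the log
line are all hypothetical, and the timestamps are passed in explicitly so that
the effect of a skewed clock is easy to see.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of Cassandra: tracks messages that are
// dropped eagerly because of the cross-node timeout check.
public class CrossNodeDropTracker {
    private final long timeoutMillis;
    private final boolean crossNodeTimeout;
    private final AtomicLong droppedCrossNode = new AtomicLong();

    public CrossNodeDropTracker(long timeoutMillis, boolean crossNodeTimeout) {
        this.timeoutMillis = timeoutMillis;
        this.crossNodeTimeout = crossNodeTimeout;
    }

    // Returns true if the message should be dropped on arrival.
    // senderTimestampMillis comes from the sender's clock, localNowMillis
    // from the receiver's clock, so skew between the two inflates (or
    // shrinks) the apparent age of the message.
    public boolean dropOnArrival(long senderTimestampMillis, long localNowMillis) {
        if (!crossNodeTimeout) {
            return false; // sender timestamps are ignored when the option is off
        }
        long apparentAgeMillis = localNowMillis - senderTimestampMillis;
        if (apparentAgeMillis > timeoutMillis) {
            droppedCrossNode.incrementAndGet();
            return true;
        }
        return false;
    }

    // Returns a summary line for the log and resets the counter,
    // or null if nothing was dropped since the last call.
    public String drainSummary() {
        long dropped = droppedCrossNode.getAndSet(0);
        if (dropped == 0) {
            return null;
        }
        return dropped + " messages dropped due to cross_node_timeout"
                + " (check for clock skew between nodes)";
    }
}
```

Calling drainSummary() on a periodic schedule and writing any non-null result to
the log would give a single total, not broken down by message type, as asked for
above.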



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
