cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-4554) Log when a node is down longer than the hint window and we stop saving hints
Date Wed, 02 Jan 2013 17:02:12 GMT


Jonathan Ellis commented on CASSANDRA-4554:

All I intended here was to store a flag (in the peers table?) or a count (would need to be
a separate CF) for when we've skipped a hint because a replica was down longer than max_hint_window.
If we really want to get fancy we can make this a replicated CF, i.e., not in the local-only
system KS. (A system_replicated KS keeps looking useful; tracing data could go there too.)
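
To make the flag-versus-count distinction concrete, here is a minimal in-memory sketch. It is not Cassandra code: the class name SkippedHintTracker, its methods, and the use of a plain map in place of a system table or separate CF are all assumptions made for illustration, as is the choice to treat downtime exactly equal to the window as still hintable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not Cassandra's actual classes or schema.
final class SkippedHintTracker {
    private final long maxHintWindowMillis;

    // replica address -> number of hints skipped.
    // Stands in for the "count" column family; an entry's mere presence is the "flag".
    private final Map<String, AtomicLong> skipped = new ConcurrentHashMap<>();

    SkippedHintTracker(long maxHintWindowMillis) {
        this.maxHintWindowMillis = maxHintWindowMillis;
    }

    /** Returns true if a hint should still be stored; otherwise records the skip. */
    boolean shouldHint(String replica, long downtimeMillis) {
        if (downtimeMillis <= maxHintWindowMillis) {
            return true;
        }
        skipped.computeIfAbsent(replica, k -> new AtomicLong()).incrementAndGet();
        return false;
    }

    /** The "flag" variant: has any hint ever been skipped for this replica? */
    boolean hasSkipped(String replica) {
        return skipped.containsKey(replica);
    }

    /** The "count" variant: how many hints were skipped for this replica? */
    long skippedCount(String replica) {
        AtomicLong c = skipped.get(replica);
        return c == null ? 0L : c.get();
    }
}
```

The flag only needs one write per replica, while the count needs an update per skipped mutation, which is one reason the two could end up stored differently.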

Extending this to "does X need a repair" is substantially more complex (CASSANDRA-2405) so
I don't consider that in scope here.

Exposing other hint metrics is also a separate problem -- I note that the JMX call for counting
hints is O(n) and may even OOM.  Let's take that to a separate ticket as well.
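
For illustration of the O(n) concern only (the actual fix belongs in that separate ticket), the sketch below contrasts counting by scanning every stored hint with reading a counter that is maintained on each store and delivery. CountedHintQueue and its methods are hypothetical names and do not correspond to the real JMX call or hint storage.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; real hints live on disk, not in a Deque.
final class CountedHintQueue {
    private final Deque<String> hints = new ArrayDeque<>();
    private final AtomicLong count = new AtomicLong();

    synchronized void store(String hint) {
        hints.addLast(hint);
        count.incrementAndGet();
    }

    /** Removes and returns the oldest hint, or null if none are pending. */
    synchronized String deliverNext() {
        String hint = hints.pollFirst();
        if (hint != null) {
            count.decrementAndGet();
        }
        return hint;
    }

    /** O(1): reads the maintained counter. */
    long pendingCount() {
        return count.get();
    }

    /** O(n): walks every stored hint, as a scan-based count would. */
    synchronized long pendingCountByScan() {
        long n = 0;
        for (String ignored : hints) {
            n++;
        }
        return n;
    }
}
```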

P.S. I'm not a fan of switching hints-in-progress to a Cache, since that implies it's okay
to throw away entries because they can be rebuilt.  This is not the case.
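
A small sketch of why cache semantics are a poor fit for state that cannot be rebuilt. BoundedLruCache is a hypothetical stand-in built on java.util.LinkedHashMap, not the cache implementation proposed in the patch; the point is only that a size-bounded cache silently drops an entry once it is full.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: a size-bounded LRU map.
final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, so the eldest entry is the least recently used
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict without notifying anyone once the bound is exceeded.
        return size() > maxEntries;
    }
}
```

With a bound of 2, inserting entries for three endpoints leaves no trace of the first one, and nothing can recompute its in-progress count after the fact.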
> Log when a node is down longer than the hint window and we stop saving hints
> ----------------------------------------------------------------------------
>                 Key: CASSANDRA-4554
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.2.1
>         Attachments: 0001-CASSANDRA-4554-add-hint-metrics.patch, 0001-CASSANDRA-4554-logging-to-system-table-v2.patch,
>
> We know that we need to repair whenever we lose a node or disk permanently (since it
> may have had undelivered hints on it), but without exposing this we don't know when nodes
> stop saving hints for a temporarily dead node, unless we're paying very close attention to
> external monitoring.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators