cassandra-commits mailing list archives

From "Aleksey Yeschenko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10173) Compaction isn't cleaning out tombstones between hint deliveries
Date Tue, 25 Aug 2015 11:34:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711103#comment-14711103 ]

Aleksey Yeschenko commented on CASSANDRA-10173:
-----------------------------------------------

Full disclosure: I haven't looked at the logs at all just yet, this is merely my guess at
this point.

I'm reasonably confident that auto-compaction is actually disabled there. However, we can
only forcefully major-compact the sstables that aren't already compacting. And there are two
different events that might trigger hints replay in pre-3.0 C*: one is the scheduled replay
(allthethings) every 10 minutes, the other one is a node coming up. And the latter would also
trigger a major compaction before starting replay, but only for the sstables that aren't already
compacting.

Because of this overlap, it's not guaranteed that a pre-replay compaction would clean out all
the tombstones. If this is indeed what is happening, one workaround would be to set {{max_hints_delivery_threads}}
to 1 (the default in the .yaml is 2).
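
For reference, a minimal sketch of that workaround as a cassandra.yaml fragment. It assumes the setting sits at the top level of the file, as in the stock pre-3.0 cassandra.yaml; check the file shipped with your version before changing it.

```yaml
# Workaround sketch: limit hint delivery to a single thread.
# The default in the shipped .yaml is 2.
max_hints_delivery_threads: 1
```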

3.0 should be looking good, obviously, for more reasons than just lack of compaction. Not
directly related to the issue, but replay triggering in 3.0 is also simpler - we just do it
every 10 seconds, because it's cheap, and don't listen to node up events at all.

> Compaction isn't cleaning out tombstones between hint deliveries
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-10173
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10173
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.2.x
>
>         Attachments: system (3).log
>
>
> 3 node cluster, 100M writes.  Same scenario as 10172:
> Test Start: 00:00:00
> Node 1 Killed: 00:05:48
> Node 2 Killed: 00:13:33
> Node 1 Started: 00:24:20
> Node 2 Started: 00:32:23
> Test Done: 00:38:33
> Node 1 hints replay finished: 00:56:16
> Node 2 hints replay finished: 01:00:16
> Node 3 hints replay finished: 02:08:00
> Log attached.  Note the tombstone_failure_threshold errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
