cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
Date Fri, 14 Aug 2015 03:24:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696395#comment-14696395 ]

Stefania commented on CASSANDRA-7066:
-------------------------------------

[~benedict]:

* I've incorporated your suggestions, thanks.

* I've rebased to 3.0.

* I've fixed the unit tests on Windows, as it was easier to do this on this branch (see CASSANDRA-10035).

* I've implemented the mechanism to choose between a hard failure and ignoring a txn log file
after a few failures. Please note the following:
** I did not switch the hard failure flag to true for any clients except the new sstableutil
tool. I am not sure which ones are good candidates: perhaps all standalone tools, or whenever we
read the snapshot directories?
** We keep a single failure counter for all files. Strictly speaking, we could have more than
one txn file at a time, but I wasn't sure if you wanted to complicate things this much.
** If hardFailure is false, we ignore txn log files after failing a few times, but we still
throw for other errors rather than returning an empty list.
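
To make the behaviour above easier to review, here is a minimal sketch of the decision logic, not the code in the patch. The class name TxnLogFailurePolicy, the Action enum, the onReadFailure method and the maxFailures threshold are all hypothetical names chosen for illustration; only the hardFailure flag and the single shared counter come from the notes above. The sketch also assumes that failures below the threshold lead to a retry.

```java
/**
 * Illustrative sketch only: names and the retry threshold are assumptions,
 * not the identifiers used in the actual patch.
 */
public class TxnLogFailurePolicy {
    /** What the caller should do after a failed attempt to read a txn log file. */
    public enum Action { RETRY, IGNORE }

    private final boolean hardFailure; // true: rethrow once the threshold is reached
    private final int maxFailures;     // hypothetical value for "a few failures"
    private int failures = 0;          // single counter shared by all txn log files

    public TxnLogFailurePolicy(boolean hardFailure, int maxFailures) {
        this.hardFailure = hardFailure;
        this.maxFailures = maxFailures;
    }

    /**
     * Records one failure and decides how to proceed.
     * Below the threshold the caller retries; at the threshold we either
     * rethrow (hard failure) or tell the caller to ignore the txn log file.
     */
    public Action onReadFailure(RuntimeException cause) {
        failures++;
        if (failures < maxFailures) {
            return Action.RETRY;
        }
        if (hardFailure) {
            throw cause;
        }
        return Action.IGNORE;
    }
}
```

With hardFailure set to false and a threshold of 3, the first two calls to onReadFailure return RETRY and the third returns IGNORE; with hardFailure set to true, the third call rethrows the cause instead. Errors that do not go through this path would still propagate, as noted above.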

> Simplify (and unify) cleanup of compaction leftovers
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>            Priority: Minor
>              Labels: benedict-to-commit, compaction
>             Fix For: 3.0 alpha 1
>
>         Attachments: 7066.txt
>
>
> Currently we manage a list of in-progress compactions in a system table, which we use
to clean up incomplete compactions when we're done. The problem with this is that 1) it's a
bit clunky (and leaves us in positions where we can unnecessarily clean up completed files,
or conversely not clean up files that have been superseded); and 2) it's only used for regular
compactions - no other compaction types are guarded in the same way, so they can result in duplication
if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and on startup
we simply delete any sstables that occur in the union of all ancestor sets. This way as soon
as we finish writing we're capable of cleaning up any leftovers, so we never get duplication.
It's also much easier to reason about.
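
The ancestor-based proposal in the description can be illustrated with a short sketch. This is not Cassandra code: the class AncestorCleanup, the method leftovers and the use of integer generation numbers to identify sstables are assumptions made for the example. It computes the union of all ancestor sets and keeps only the entries that are still present on disk, which are the candidates for deletion on startup.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only: sstables are modelled as integer generations. */
public class AncestorCleanup {
    /**
     * Given each on-disk sstable mapped to its direct ancestors, returns the
     * on-disk sstables that appear in the union of all ancestor sets.
     * These have been replaced by a fully written sstable and can be deleted.
     */
    public static Set<Integer> leftovers(Map<Integer, Set<Integer>> ancestorsBySSTable) {
        Set<Integer> union = new HashSet<>();
        for (Set<Integer> ancestors : ancestorsBySSTable.values()) {
            union.addAll(ancestors);
        }
        // Ancestors that were already deleted are no longer on disk; skip them.
        union.retainAll(ancestorsBySSTable.keySet());
        return union;
    }

    public static void main(String[] args) {
        Map<Integer, Set<Integer>> onDisk = new HashMap<>();
        onDisk.put(1, new HashSet<>());                      // leftover input of a compaction
        onDisk.put(2, new HashSet<>());                      // leftover input of a compaction
        onDisk.put(3, new HashSet<>(Arrays.asList(1, 2)));   // result of compacting 1 and 2
        onDisk.put(4, new HashSet<>(Arrays.asList(0)));      // ancestor 0 was already removed
        System.out.println(new TreeSet<>(leftovers(onDisk)));
    }
}
```

In this example sstables 1 and 2 are reported as leftovers because sstable 3 lists them as ancestors, while sstables 3 and 4 are kept. The sketch assumes an sstable only becomes visible with its ancestor metadata once it has been completely written.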



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
