cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10421) Potential issue with LogTransaction as it only checks in a single directory for files
Date Fri, 09 Oct 2015 17:45:05 GMT


Ariel Weisberg commented on CASSANDRA-10421:

bq. The reason is that we don't want to delete files by mistake, as in users copying files from
backup without removing a partial txn log that happened to have obsoleted the very same files.
There is a comment in deleteRecord() but it doesn't seem complete, so we should probably add
more comments about this.
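To make the hazard concrete, here is a minimal sketch (hypothetical names, not the actual deleteRecord() code) of the kind of guard that protects against this: only act on a record if the on-disk file still matches the state the record captured when it was written. A file freshly copied from backup will typically have a different modification time, so the record is treated as stale and the delete is skipped.

```java
import java.io.File;

// Hypothetical sketch, not Cassandra's actual deleteRecord() logic: before
// deleting a file named in a txn log record, require that the on-disk file
// still matches the state the record captured. A file restored from backup
// after the log was written will usually fail this check.
class RecordGuard
{
    // Pure decision logic, kept separate so it is easy to reason about.
    static boolean matches(boolean fileExists, long onDiskMtime, long mtimeInLog)
    {
        if (!fileExists)
            return false; // nothing on disk, nothing to delete
        return onDiskMtime == mtimeInLog; // delete only the exact file the record described
    }

    static boolean safeToDelete(File f, long mtimeInLog)
    {
        return matches(f.exists(), f.lastModified(), mtimeInLog);
    }
}
```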
OK yeah that sounds thorny. I get where this is coming from now.

bq. The current approach is the most permissive approach, in that we clean up a transaction
even if some final files are missing (because a disk was removed for example). My reasoning
is that from the transaction point of view, it did complete or abort and so it should not
keep old files if the new files have been removed for another reason, or should it?
After looking at the discussion in CASSANDRA-7066 I think I have a better understanding of the
situations the log is supposed to address. The key points are that correctness is maintained even
if atomicity is lost for rollback. This is just trying to clean up after in-progress operations,
as a better approach compared to earlier/other mechanisms (system tables, renaming, ancestor
analysis). The window at the end, where we might fail to clean up because we fail to write
one log or fail to read it on restart, isn't super important.

I am not sure if losing atomicity on commit is safe. I think it isn't because you could have
a log asking you to remove some files that are ancestors, but no log records telling you to
add their replacements in a different location. Do the replacements just silently get added
in that case? If so then yes I think it's fine.
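As a toy model of that scenario (assumed semantics on my part, not Cassandra code): suppose replay deletes whatever the removal records name, and ordinary sstable discovery picks up everything else left on disk. Then a replacement written to a second directory whose add record was lost does get silently added, because discovery sees the file regardless:

```java
import java.util.*;

// Toy model of commit replay without full atomicity: committed removal
// records are applied, and any other file on disk (including a replacement
// whose add record was lost) is simply picked up by normal sstable discovery.
class CommitReplay
{
    static Set<String> visibleAfterReplay(Set<String> onDisk, Set<String> removalRecordsSeen)
    {
        Set<String> visible = new HashSet<>(onDisk);
        visible.removeAll(removalRecordsSeen); // committed => obsolete files go away
        return visible;
    }
}
```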

When disks start going bad and files start disappearing, the operator already has to fall back
on repair and restoring from backups. But Jonathan did voice a preference for keeping extra data
and doing more compaction when things are going poorly, so that is my guiding logic right now:
leave a little bit more data when you have full-on corruption/missing files and focus on the
common case of restarts. [~benedict] WDYT?

I have some questions about what happens at runtime when we go to create these log files.
* When writing a commit, if we fail to write the commit record to all files, does that mean we
don't attempt to transition to the new state?
* The transition to the new state is always going to be applying deletions to obsolete sstables.
* What about failing to add a record?
* If that fails, does it abort the transaction?
* Would that block compaction progress?
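For what it's worth, one plausible answer to these questions (an assumption on my part, not necessarily what the patch does) is: append each record to every copy of the log, and if any copy can't be written, report failure so the transaction aborts instead of transitioning to the new state. Modeled in memory:

```java
import java.util.*;

// Hedged sketch of one possible policy, modeled in memory rather than on
// disk: a record must reach every copy of the txn log; a single failed copy
// means the caller should abort rather than transition to the new state.
class TxnLogModel
{
    final Map<String, List<String>> copies = new HashMap<>(); // data directory -> records
    final Set<String> failedDirs = new HashSet<>();           // simulated bad disks

    // Returns false when any copy could not be appended; the caller aborts.
    boolean appendToAll(String record)
    {
        boolean ok = true;
        for (Map.Entry<String, List<String>> e : copies.entrySet())
        {
            if (failedDirs.contains(e.getKey()))
            {
                ok = false; // this copy is lost, so the commit would not be durable
                continue;
            }
            e.getValue().add(record);
        }
        return ok;
    }
}
```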

These are all high-level questions just from me figuring out how things work. Now I'll look at
the tests for the changes you have implemented so far.

> Potential issue with LogTransaction as it only checks in a single directory for files
> -------------------------------------------------------------------------------------
>                 Key: CASSANDRA-10421
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Stefania
>            Priority: Blocker
>             Fix For: 3.0.0 rc2
> When creating a new LogTransaction we try to create the new logfile in the same directory
as the one we are writing to, but as we use {{directories.getDirectoryForNewSSTables()}}
this might end up in "any" of the configured data directories. If it does, we will not be
able to clean up leftovers, as we only check for files in the same directory as the logfile.
> cc [~Stefania]
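The fix direction implied by the description above can be sketched like this (the helper and its names are mine, purely illustrative): collect leftover files across every configured data directory, instead of only the directory that happens to hold the logfile.

```java
import java.util.*;

// Illustrative only: model each data directory as a list of file names and
// gather leftovers across all of them, not just the logfile's directory.
class LeftoverScan
{
    static List<String> leftoversFor(String txnPrefix, Map<String, List<String>> dirContents)
    {
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, List<String>> dir : dirContents.entrySet())
            for (String name : dir.getValue())
                if (name.startsWith(txnPrefix))
                    found.add(dir.getKey() + "/" + name);
        return found;
    }
}
```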

This message was sent by Atlassian JIRA
