cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10421) Potential issue with LogTransaction as it only checks in a single directory for files
Date Wed, 07 Oct 2015 23:05:27 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947768#comment-14947768 ]

Ariel Weisberg commented on CASSANDRA-10421:
--------------------------------------------

Can you set the ticket to patch available?

If rc2 is due at the end of the week, I think it's going to be hard to get to +1 for a new sub-system
and a patch like this. I won't be able to do more review until tomorrow evening my
time.

bq. We don't duplicate all records to all files, only the final commit or abort flag is written
to all files. When we read the files on startup we collect all the sstable records from all
existing txn files and check that the final flag record is the same in all files (but we do
accept if some files are missing the last record and in this case we just warn). Therefore,
if a file is lost we continue with the transaction processing but we do not touch the sstables
in the folder of that file. Chances are either the entire disk was lost or the user deleted
the file and in this case the user probably wanted to keep the sstables. Does this make sense?
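A minimal sketch of the startup check described in the quote above, not the code from the patch. The names TxnReplicaCheck, Replica, Outcome and resolve are hypothetical, as is the record format; each replica stands for one txn file in one data directory. It collects the sstable records from every replica, requires that all final records that are present agree, and only warns when a replica is missing its final record. Skipping the sstables in the folder of a lost file is not modelled here.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class TxnReplicaCheck {
    // Hypothetical outcomes of reading the txn files on startup.
    enum Outcome { COMMIT, ABORT, UNDECIDED, INCONSISTENT }

    // Hypothetical in-memory model of one txn file (one per data directory).
    static final class Replica {
        final List<String> sstableRecords;   // illustrative format, e.g. "ADD:ma-1"
        final Optional<Outcome> finalRecord; // empty if the last record is missing

        Replica(List<String> sstableRecords, Optional<Outcome> finalRecord) {
            this.sstableRecords = sstableRecords;
            this.finalRecord = finalRecord;
        }
    }

    // Collects sstable records from all replicas into 'collected' and returns
    // the agreed final outcome, or INCONSISTENT if two replicas disagree.
    static Outcome resolve(List<Replica> replicas, Set<String> collected) {
        for (Replica r : replicas)
            collected.addAll(r.sstableRecords);

        Outcome agreed = null;
        for (Replica r : replicas) {
            if (!r.finalRecord.isPresent()) {
                // Accepted, as described in the quote: warn and carry on.
                System.err.println("warning: txn file is missing its final record");
                continue;
            }
            Outcome o = r.finalRecord.get();
            if (agreed == null)
                agreed = o;
            else if (agreed != o)
                return Outcome.INCONSISTENT;
        }
        return agreed == null ? Outcome.UNDECIDED : agreed;
    }
}
```

In this model a replica with a COMMIT record plus a replica with no final record resolves to COMMIT, while a COMMIT alongside an ABORT is reported as INCONSISTENT.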

Is there an advantage to writing only the commit record to all the files? Seems conceptually
easier for them to all be the same log and since it is low traffic there is no performance
motivation. Was there a discussion somewhere else where it seemed like people might delete
the file? Are we really ok with losing atomicity if they don't lose the whole disk?

If they all had all the records you could just read the contents from any file that has a
commit record.
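
To illustrate the alternative, here is a rough sketch under the assumption that every txn file holds the full, identical log and that a commit is marked by a last line equal to "COMMIT". The class IdenticalReplicaReader, the method readCommitted and the line format are all hypothetical and not taken from the patch.

```java
import java.util.List;
import java.util.Optional;

final class IdenticalReplicaReader {
    // Hypothetical marker for the final commit record.
    static final String COMMIT = "COMMIT";

    // Each inner list holds the lines of one txn file. Returns the records
    // (without the commit marker) from the first file that ends with a
    // commit record, or empty if no file has one.
    static Optional<List<String>> readCommitted(List<List<String>> replicas) {
        for (List<String> lines : replicas) {
            if (!lines.isEmpty() && COMMIT.equals(lines.get(lines.size() - 1)))
                return Optional.of(lines.subList(0, lines.size() - 1));
        }
        return Optional.empty();
    }
}
```

With identical logs, a truncated or missing file can simply be skipped, because any committed copy carries the whole transaction.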

* [Can you just have it do nothing if it is called multiple times?|https://github.com/stef1927/cassandra/commit/49a2d5f289c98fcf646272cfab777d20001f9b9e#diff-d8721a4dd04f4f35b65e7edeb6c883f6R198].
Maybe save a headache down the road.
* [Why is this check not necessary anymore?|https://github.com/stef1927/cassandra/commit/49a2d5f289c98fcf646272cfab777d20001f9b9e#diff-d8721a4dd04f4f35b65e7edeb6c883f6L161]
* [Not part of your current work, but relying on the modified time of files seems suspect to
me; the file contents should record the modified time so that copying and other operations don't change
it|https://github.com/stef1927/cassandra/commit/49a2d5f289c98fcf646272cfab777d20001f9b9e#diff-d8721a4dd04f4f35b65e7edeb6c883f6R283]

I got through a first pass of the implementation. I'm looking at the tests now.


> Potential issue with LogTransaction as it only checks in a single directory for files
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10421
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10421
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Stefania
>            Priority: Blocker
>             Fix For: 3.0.0 rc2
>
>
> When creating a new LogTransaction we try to create the new logfile in the same directory
> as the one we are writing to, but as we use {{[directories.getDirectoryForNewSSTables()|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/lifecycle/LogTransaction.java#L125]}}
> this might end up in "any" of the configured data directories. If it does, we will not be
> able to clean up leftovers as we check for files in the same directory as the logfile was
> created: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/lifecycle/LogRecord.java#L163
> cc [~Stefania]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
