cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10421) Potential issue with LogTransaction as it only checks in a single directory for files
Date Thu, 15 Oct 2015 02:10:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958185#comment-14958185
] 

Stefania commented on CASSANDRA-10421:
--------------------------------------

bq. Would this be what happens? We don't validate checksums of entire files at startup. Or
is there a flag somewhere else that allows us to know which files are incomplete?

I'm not super expert in reading sstables but if the metadata (statistics file) is missing
the sstable won't be loaded at startup and the metadata is created in BTW {{doPrepare}} after
the data file is completed. So normally incomplete data files should be detected at startup.


bq. I just noticed that it seems to open each file every time to append a record? Since each
record is only one table seems like that could end up being a little slow although most transactions
won't touch that many tables so maybe it's not an issue. It syncs the folder descriptor every
time it writes to a file? Seems like the folder only needs to be synced once when the file
is created or is that done as a side effect for other operations? Is the intent to sync the
file for each record and not the directory?

We need to sync the folder only when files are created or deleted. At the moment the txn log
file is created lazily when the first record is added, hence the sync when appending records.
Now that we have a wrapper class around the file and folder descriptor, {{LogReplica}}, and
since this class happens to need a close method to release the folder descriptor, we can keep
a file writer open and so I've changed it to this effect. This might break some unit tests
on Windows however, so I've started CI on Windows as well.

bq. LogFile.deleteRecords() applies records of a specific type? Just based on the name I thought
it was modifying the log contents somehow. Seems like it and the chain of delegating calls
should be named something else like applyRecords.

It deletes the files of the records with a specific type. I've renamed {{deleteRecords}} to
{{deleteFilesForRecordsOfType}} and {{deleteRecord}} to {{deleteRecordFiles}}.

bq. Can you add a comment that the LogFile.records is a LinkedHashSet because it must be order
preserving? When I saw it floating around without the LHS type and just Set I was confused.
Wouldn't want to it change into something that is not order preserving later on.

Sure, it done.

> Potential issue with LogTransaction as it only checks in a single directory for files
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10421
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10421
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Stefania
>            Priority: Blocker
>             Fix For: 3.0.0 rc2
>
>
> When creating a new LogTransaction we try to create the new logfile in the same directory
as the one we are writing to, but as we use {{[directories.getDirectoryForNewSSTables()|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/lifecycle/LogTransaction.java#L125]}}
this might end up in "any" of the configured data directories. If it does, we will not be
able to clean up leftovers as we check for files in the same directory as the logfile was
created: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/lifecycle/LogRecord.java#L163
> cc [~Stefania]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message