cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9627) fsync should not be "best effort" (and silently fail on e.g. windows)
Date Sun, 21 Jun 2015 17:13:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595132#comment-14595132 ]

Benedict commented on CASSANDRA-9627:
-------------------------------------

If that's the case, I'm reasonably happy to reduce the priority and change the scope of the
ticket.

[~JoshuaMckenzie]: Could you confirm before I do this that we're certain an fsync of the file
creates a happens-before relation for all directory modifications that have come before? If
repairing the directory results in any slip backwards or forwards in time of a modification
to the structure, we could be SOL.

I'm particularly thinking of this in relation to CASSANDRA-7066, where we'll be relying on
the (non-)presence of files to ensure transactional swapping of our live sstable contents.
The risk exists with or without CASSANDRA-7066, but it will widen our surface of exposure
pretty significantly if we're wrong, since we will in rapid succession make a number of directory
modifications that all require a happens-before guarantee.
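
To make the concern concrete, here is a minimal sketch of a conservative ordering that does not rely on a file fsync covering earlier directory modifications: write the file, force the file, then force the directory explicitly. The class and method names (DurableMarker, syncDirectory, create) are hypothetical and are not taken from the Cassandra code base. It also assumes the platform allows opening a FileChannel on a directory, which works on Linux but is not guaranteed everywhere.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only; not Cassandra's implementation.
final class DurableMarker {

    // Force the directory's own metadata (its entries) to disk.
    // Any IOException propagates to the caller instead of being swallowed.
    static void syncDirectory(Path dir) throws IOException {
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true);
        }
    }

    // Create a new file whose presence acts as a marker, and only return
    // once both its contents and its directory entry have been forced.
    static Path create(Path dir, String name, String contents) throws IOException {
        Path marker = dir.resolve(name);
        try (FileChannel ch = FileChannel.open(marker,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(contents.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // step 1: file contents and file metadata
        }
        syncDirectory(dir); // step 2: the directory entry, explicitly
        return marker;
    }
}
```

Calling DurableMarker.create(dir, "txn.log", "commit") either returns with both syncs requested or throws. Whether step 2 is redundant on a given filesystem is exactly the open question above; the sketch simply avoids depending on the answer.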

> fsync should not be "best effort" (and silently fail on e.g. windows)
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-9627
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9627
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Priority: Blocker
>             Fix For: 2.2.0 rc2
>
>
> Currently we make an effort to synchronize both the file contents and the directory contents.
> Both are essential to ensure no data loss. Currently we just try to do this, and ignore the
> problem if we can't. Presumably this behaviour was to "sort of" support Windows (i.e. not
> crash). Now we officially support Windows, we need to behave better, and really IMO we should
> _never_ for any platform ignore a failure here. It should be part of our pre-flight checks:
> if we cannot do it, we cannot run safely.
> It looks like this may be supported trivially through FileChannel, by opening one on
> the directory itself (and calling force()), although it's not clear if this will still be
> supported in Java 9 [see discussion here|http://mail.openjdk.java.net/pipermail/nio-dev/2015-May/003140.html].
> [~JoshuaMcKenzie]: assigning to you for now, just so it's tracked by the Windows overlord.
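
A rough sketch of the pre-flight check described in the ticket, assuming the FileChannel approach mentioned there: open a channel on the data directory, call force(), and refuse to start if either step fails. The names FsyncPreflight and requireDirectorySync are hypothetical, and the choice of IllegalStateException is an assumption for illustration. This is not the patch that was committed for this ticket.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical startup check, for illustration only.
final class FsyncPreflight {

    // Try to fsync the directory once at startup. A failure is surfaced as
    // an exception rather than logged and ignored ("best effort").
    static void requireDirectorySync(Path dataDir) {
        try (FileChannel ch = FileChannel.open(dataDir, StandardOpenOption.READ)) {
            ch.force(true);
        } catch (IOException e) {
            throw new IllegalStateException(
                "Cannot fsync directory " + dataDir + "; refusing to run", e);
        }
    }
}
```

Running this against each data directory before accepting writes would turn a silent failure into a startup error. On a platform where a directory cannot be opened this way, the check fails by design, so a platform-specific alternative would be needed there.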



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
