cassandra-commits mailing list archives

From "Robert Coli (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3564) flush before shutdown so restart is faster
Date Tue, 04 Sep 2012 21:16:09 GMT


Robert Coli commented on CASSANDRA-3564:

> That sounds reasonable but I think we should have some kind of ceiling (10 minutes or
something) where we kill -9 it, just in case we ever have a bug that causes us not to exit
(we've had them before), so we don't hang the shutdown of the entire machine forever.
> ...
> Unless I'm missing something, you can't do anything about a kill -9, you're cooked.

The trivial case of this is a node where the data directory has been marked read-only due
to errors but the commitlog is on a different device which is still writable.

In the status quo, stopping such a node will not result in the sstable flush blocking forever.
The node just stops. On restart it replays and (CASSANDRA-1967) these replayed memtables are
then flushed. This results in the same flush blocking forever, but the node otherwise serves
reads and can take writes until it OOMs. It also doesn't need to be sent a SIGKILL at any point.

If the node flushes on shutdown in such a way that it is effectively "drain"ing the node,
then in order to avoid data loss you merely need to wait for the commitlog sync. The relevant
thing seems to be that the *commitlog* is synced before you SIGKILL the node, not whether
the *flush* succeeds or not. In practice this window is likely a few seconds at most, even
with the most lenient commitlog sync settings, and is therefore likely irrelevant.
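For reference, the widest such window corresponds to periodic commitlog sync, configured in
cassandra.yaml. The values below are the stock defaults as I understand them; confirm against
the cassandra.yaml shipped with your version before relying on them:

```yaml
# cassandra.yaml -- commitlog sync settings (stock defaults; verify for
# your Cassandra version). With periodic sync, an fsync is issued every
# commitlog_sync_period_in_ms, so the window of unsynced writes lost to
# a SIGKILL is bounded by roughly this period.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
```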

However, with flush-on-shutdown you *have* to send the process SIGKILL in this case, because
the flush can hang indefinitely. I get worried any time I *have* to send SIGKILL to a database,
even if I understand logically that it is safe. Adding flush to the shutdown path seems to
create a new case in which I *have* to do this uncomfortable thing.
> That is to say, with the status quo, if you want to flush before shutdown, you call nodetool
flush. Not a big deal. But if we made it flush-everything-by-default then to make it NOT flush
our options include.

I don't understand why calling nodetool flush/drain is not a big deal here, but is enough of
a big deal to justify special-casing shutdown when durable_writes is off in CASSANDRA-2958.

In my opinion, the sane default here is the pre-2958 status quo: no flushing on shutdown
ever, including when durable_writes is off. Operators who want to drain nodes before stopping
them can do so via nodetool.
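A minimal sketch of that operator-driven sequence, written as a shell function. This is
illustrative only: it assumes `nodetool` is on PATH and that the node's pid is written to a
pidfile whose location is install-specific.

```shell
# Hypothetical "drain, then stop" helper. Assumes nodetool is on PATH and
# that the node writes its pid to the given pidfile (install-specific).
drain_and_stop() {
    pidfile="$1"
    # Flush memtables and stop accepting writes; the operator chooses to
    # pay the flush cost here, rather than having it forced on shutdown.
    nodetool drain || return 1
    # Clean JVM shutdown; the commitlog is already synced at this point.
    kill -INT "$(cat "$pidfile")"
}
```

The point of keeping this in operator hands is exactly the one made above: the operator
decides when a potentially long flush is acceptable, rather than having every shutdown risk
hanging on it.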
> flush before shutdown so restart is faster
> ------------------------------------------
>                 Key: CASSANDRA-3564
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Packaging
>            Reporter: Jonathan Ellis
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 1.2.0
>         Attachments: 3564.patch, 3564.patch
> Cassandra handles flush in its shutdown hook for durable_writes=false CFs (otherwise
we're *guaranteed* to lose data) but leaves it up to the operator otherwise.  I'd rather leave
it that way to offer these semantics:
> - cassandra stop = shutdown nicely [explicit flush, then kill -int]
> - kill -INT = shutdown faster but don't lose any updates [current behavior]
> - kill -KILL = lose most recent writes unless durable_writes=true and batch commits are
on [also current behavior]
> But if it's not reasonable to use nodetool from the init script then I guess we can just
make the shutdown hook flush everything.
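The signal semantics quoted above can be demonstrated with any process that installs a
cleanup handler, standing in for Cassandra's JVM shutdown hook. This is a generic
illustration, not Cassandra-specific: SIGINT is catchable and lets cleanup run; SIGKILL
cannot be caught at all.

```shell
out=$(mktemp)

# SIGINT is catchable: the handler (shutdown hook analogue) runs first.
sh -c 'trap "echo cleanup-ran; exit 0" INT; sleep 30 >/dev/null 2>&1 & wait' > "$out" &
victim=$!
sleep 1
kill -INT "$victim"        # like "kill -INT": clean shutdown, handler runs
wait "$victim"
echo "SIGINT exit: $?"     # 0: the trap exited cleanly

# SIGKILL is not catchable: no handler runs, in-memory state is lost.
sh -c 'trap "echo never-printed" INT; sleep 30 >/dev/null 2>&1 & wait' &
victim=$!
sleep 1
kill -KILL "$victim"       # like "kill -KILL": unconditional termination
wait "$victim" 2>/dev/null
echo "SIGKILL exit: $?"    # >128 (128 + signal number, typically 137)
```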

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
