cassandra-commits mailing list archives

From "Philip Thompson (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data
Date Fri, 09 Jan 2015 16:14:35 GMT


Philip Thompson updated CASSANDRA-8523:
    Fix Version/s: 2.1.3
       Issue Type: Improvement  (was: Bug)

I've checked with [~brandon.williams], and this is intended behavior, so I'm marking this
as an Improvement, not a Bug. Do note that the replacement node will receive all writes made
while it was streaming that were captured as hints, so if the stream takes less time than the
hint window, you should not see much discrepancy.
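
To make the hint-window reasoning concrete, here is a rough sketch in Python. The helper name is hypothetical and not part of Cassandra, and the 3-hour default is an assumption based on the stock max_hint_window_in_ms setting in cassandra.yaml; check your own configuration. It also ignores any time the node was already down before streaming began, and hints lost for other reasons.

```python
# Hypothetical helper for illustration only; not a Cassandra API.
# Assumed default: max_hint_window_in_ms = 10800000 (3 hours).
DEFAULT_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def unhinted_write_seconds(stream_seconds, hint_window_ms=DEFAULT_HINT_WINDOW_MS):
    """Return how many seconds of writes fall outside the hint window.

    Writes made during the first hint_window_ms of streaming can be
    captured as hints and replayed later; anything after that is missed
    and has to be recovered by repair.
    """
    hint_window_seconds = hint_window_ms / 1000.0
    return max(0.0, stream_seconds - hint_window_seconds)


if __name__ == "__main__":
    # A 2-hour stream fits inside the assumed 3-hour window.
    print(unhinted_write_seconds(2 * 3600))
    # A 5-hour stream leaves 2 hours of writes not covered by hints.
    print(unhinted_write_seconds(5 * 3600))
```

Under these assumptions, a stream that finishes inside the window leaves nothing for repair to recover beyond what hints replay, while a longer stream leaves a gap that grows with the overrun.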

> Writes should be sent to a replacement node while it is streaming in data
> -------------------------------------------------------------------------
>                 Key: CASSANDRA-8523
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Richard Wagner
>             Fix For: 2.0.12, 2.1.3
> In our operations, we make heavy use of replace_address (or replace_address_first_boot)
> in order to replace broken nodes. We now realize that writes are not sent to the replacement
> nodes while they are in hibernate state and streaming in data. This runs counter to our
> expectations, especially since we know that writes ARE sent to nodes when they are bootstrapped
> into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in the process
> of replacing another node, just like it does for nodes that are bootstrapping. I hesitate
> to phrase this as "we should send writes to a node in hibernate" because the concept of hibernate
> may be useful in other contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period makes subsequent
> repairs more expensive, proportional to the number of writes that we miss (and depending on
> the amount of data that needs to be streamed during replacement and the time it may take to
> rebuild secondary indexes, we could miss many hours' worth of writes). It also leaves
> us more exposed to consistency violations.
