cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-13441) Schema version uses built-in digest which includes timestamps, causing migration storms
Date Fri, 14 Apr 2017 21:05:41 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-13441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13441:
-----------------------------------
    Reviewer: Aleksey Yeschenko

> Schema version uses built-in digest which includes timestamps, causing migration storms
> ---------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13441
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13441
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Schema
>            Reporter: Jeff Jirsa
>            Assignee: Jeff Jirsa
>             Fix For: 4.x
>
>
> In versions < 3.0, the schema version was essentially deterministic - a given schema always
hashed to the same version, so during a rolling upgrade (say 2.0 -> 2.1), the first node to
upgrade to 2.1 would add the new tables, setting the new 2.1 version ID, and subsequently
upgraded hosts would settle on that version.
> In 3.0, we delegate the digest calculation to the post-8099 data structures, using the same
digest calculators that the read path uses for digest match/mismatch - which means the digest
includes timestamps (and TTLs).
> Since schema will never use TTL, we don't care about TTL fields. Similarly, when a 3.0
node upgrades and writes its own new-in-3.0 system tables, it'll write the same tables that
exist in the schema with brand new timestamps. As written, this will cause all nodes in the
cluster to change schema (to the version with the newest timestamp), and then change a second
time as the non-system schema is propagated to the newly upgraded nodes.
> On a sufficiently large cluster with a non-trivial schema, this could cause (literally)
millions of migration tasks to needlessly bounce across the cluster.
> Up for discussion: if we fix this in 3.0 (say 3.0.X where X >= 14), then any 3.0 node
below that version will always mismatch, and cause the ping-ponging described in CASSANDRA-11050.
However, if we don't fix it, we create a situation that's potentially an outage on rolling
upgrade. I'm leaning towards a strong warning in NEWS about the right way to upgrade, and
fixing it in 4.x, but wouldn't mind hearing opinions from [~slebresne] and [~iamaleksey] and
[~amorton] since you three already talked about this on CASSANDRA-11050.
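
The effect described in the issue can be illustrated with a small sketch. It is not Cassandra's code: `schema_digest`, the `cells` tuples, and the sample values for `node_a` and `node_b` are hypothetical, and MD5 is used only for illustration. It shows that two nodes holding identical schema content agree on a content-only digest but disagree as soon as write timestamps are folded in.

```python
import hashlib


def schema_digest(cells, include_timestamps):
    """Hash (name, value, write_timestamp) cells in a fixed order.

    Hypothetical helper for illustration; not Cassandra's implementation.
    """
    h = hashlib.md5()
    for name, value, timestamp in sorted(cells):
        for part in (name, value):
            data = part.encode("utf-8")
            # Length-prefix each field so adjacent fields cannot run together.
            h.update(len(data).to_bytes(4, "big"))
            h.update(data)
        if include_timestamps:
            h.update(timestamp.to_bytes(8, "big"))
    return h.hexdigest()


# Same schema content, written locally by two nodes at different times
# (made-up values, timestamps in microseconds).
node_a = [
    ("table_name", "roles", 1492203941000000),
    ("comment", "role definitions", 1492203941000000),
]
node_b = [
    ("table_name", "roles", 1492204999000000),
    ("comment", "role definitions", 1492204999000000),
]

# With timestamps included, the nodes disagree on the version.
print(schema_digest(node_a, True) == schema_digest(node_b, True))    # False
# With content only, they agree.
print(schema_digest(node_a, False) == schema_digest(node_b, False))  # True
```

In this sketch the content-only digest plays the role of the deterministic pre-3.0 behaviour, while the timestamp-inclusive digest shows why a freshly upgraded node that rewrites the same tables would advertise a different version.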



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
