cassandra-user mailing list archives

From Jeff Jirsa <jji...@gmail.com>
Subject Re: Version Rollback
Date Tue, 27 Feb 2018 17:47:40 GMT
MOST minor versions support rollback. The exceptions are releases where the internode protocol
changes (3.0.14 being the only one in recent memory) or where the sstable format changes (again, rare).
No major versions support rollback. The only way to do it is to upgrade in a way that lets you
effectively reinstall the old version without data loss.

The steps usually look like:

Test in a lab
Test in a lab again
Test in a lab a few more times
Snapshot everything 

If you have a passive data center:
- upgrade one instance
- check to see if it’s happy
- upgrade another
- check to see if it’s happy
- continue until the passive dc is done
- if at any point they’re unhappy, rebuild the dc from the active dc (wipe it and restream
on the old version)
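
The passive-DC loop above can be sketched roughly as follows. This is only an illustration: `upgrade` and `is_happy` are hypothetical hooks you would supply yourself (for example, wrappers around your own deployment tooling and health checks), and nothing here talks to Cassandra directly.

```python
from typing import Callable, Iterable, Optional


def rolling_upgrade(
    nodes: Iterable[str],
    upgrade: Callable[[str], None],
    is_happy: Callable[[str], bool],
) -> Optional[str]:
    """Upgrade nodes one at a time, stopping at the first unhappy one.

    Returns the name of the first unhappy node, or None if every node
    upgraded cleanly. Both callables are hypothetical, caller-supplied hooks.
    """
    for node in nodes:
        upgrade(node)
        if not is_happy(node):
            # Stop here: the caller would now rebuild the passive DC
            # from the active DC on the old version.
            return node
    return None
```

A non-None return value is the signal to stop and rebuild rather than continue with the remaining instances.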

On the active DCs, you’ll want to canary it one replica at a time so you can treat a failed
upgrade like a bad disk:
- upgrade one instance
- check if it’s happy; if it’s not, treat it like a failed disk and replace it with the
old version
- if you’re using single token, do another instance in a different replica set, repeat until
you’re out of different replicas. 
- if you’re using vnodes but a rack aware snitch and have more racks than your RF, do another
instance in the same rack as the canary, repeat until you’re out of instances in that rack

This is typically your point of no return: as soon as you have two replicas on the new version,
rollback is no longer practical.
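
For the vnodes case, a rough sketch of the bookkeeping might look like this. It assumes a rack-aware snitch that places at most one replica of any range in each rack, so upgrading within a single rack touches at most one replica per range. The `racks` mapping (node name to rack name) and both function names are made up for illustration.

```python
from typing import Dict, List, Set


def same_rack_plan(racks: Dict[str, str], canary: str) -> List[str]:
    """Return the canary followed by the other nodes in the canary's rack."""
    rack = racks[canary]
    others = sorted(n for n, r in racks.items() if r == rack and n != canary)
    return [canary] + others


def past_point_of_no_return(racks: Dict[str, str], upgraded: Set[str]) -> bool:
    """True once upgraded nodes span more than one rack.

    Under the one-replica-per-rack assumption, a second rack means some
    range may now have two replicas on the new version.
    """
    return len({racks[n] for n in upgraded}) > 1
```

With `racks = {"n1": "r1", "n2": "r1", "n3": "r2"}` and `n2` as the canary, the plan is `n2` then `n1`; upgrading `n3` as well would cross the line.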



-- 
Jeff Jirsa


> On Feb 27, 2018, at 9:22 AM, Carl Mueller <carl.mueller@smartthings.com> wrote:
> 
> My speculation is that IF (big if) the sstable formats are compatible between the versions, which probably isn't the case for major versions, then you could drop back.
> 
> If the sstable format changed, you'll probably need to figure out how to rewrite the sstables in the older format and then sstableloader them into the older-version cluster if need be. Alas, while there is an sstable upgrader, there isn't a downgrader AFAIK.
> 
> And I don't have an intimate view of version-by-version sstable format changes and compatibilities. You'd probably need to check the upgrade instructions (which you presumably did if you're upgrading versions) to tell.
> 
> Basically, version rollback is pretty unlikely to be done.
> 
> The OTHER option:
> 
> 1) build a new cluster with the new version, no new data. 
> 
> 2) code your driver layer to talk to both clusters. Write to both, but read preferentially from the new, then fall through to the old. Yes, that gets hairy on multi-row queries. Port your data gradually from the old cluster to the new with sstable loading.
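
A minimal sketch of the write-to-both, read-new-then-old idea for single-key access. Plain dicts stand in for the two clusters here; `DualClusterStore` and its methods are hypothetical names, not part of any Cassandra driver, and a real version would also need to handle partial write failures.

```python
from typing import Any, MutableMapping, Optional


class DualClusterStore:
    """Write to both stores; read from the new one first, then the old."""

    def __init__(self, new: MutableMapping[str, Any], old: MutableMapping[str, Any]):
        self.new = new
        self.old = old

    def write(self, key: str, value: Any) -> None:
        # Double write: every write costs a round trip to each cluster.
        self.new[key] = value
        self.old[key] = value

    def read(self, key: str) -> Optional[Any]:
        # Prefer the new cluster; fall through to the old for data
        # that has not been ported yet.
        if key in self.new:
            return self.new[key]
        return self.old.get(key)
```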
> 
> When you've done a full load of all the data from old to new, and you're satisfied with the new cluster stability, retire the old cluster.
> 
> For merging two multirow sets you'll probably need your multirow queries to return the partition hash value (or extract the code that generates the hash), and have two simultaneous java-driver ResultSets going, and merge their results, providing the illusion of a single database query. You'll need to pay attention to both the row key ordering and column key ordering to ensure the combined results are properly ordered.
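
The merge itself might be sketched like this, in Python rather than against the java-driver for brevity. It assumes each stream is already sorted by (token, clustering key) and that rows have the simplified shape `(token, clustering_key, value)`; that shape and the `merge_rows` name are illustrative assumptions, and token collisions between different partition keys are ignored.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical row shape: (partition token, clustering key, value).
Row = Tuple[int, str, str]


def merge_rows(new_rows: Iterable[Row], old_rows: Iterable[Row]) -> Iterator[Row]:
    """Merge two sorted row streams into one ordered stream.

    When both clusters return a row for the same (token, clustering key),
    the row from the new cluster wins and the old one is dropped.
    """
    # Tag each row with its source so that, for equal keys, new (0) sorts
    # ahead of old (1).
    tagged_new = ((r[0], r[1], 0, r) for r in new_rows)
    tagged_old = ((r[0], r[1], 1, r) for r in old_rows)
    last_key = None
    for token, ck, _source, row in heapq.merge(tagged_new, tagged_old):
        if (token, ck) != last_key:
            last_key = (token, ck)
            yield row
```

Both inputs are consumed lazily, so neither result set has to be held in memory in full.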
> 
> Writes will be slowed by the double writes; reads will be bound by the worse-performing cluster.
> 
>> On Tue, Feb 27, 2018 at 8:23 AM, Kenneth Brotman <kenbrotman@yahoo.com.invalid> wrote:
>> Could you tell us the size and configuration of your Cassandra cluster?
>> 
>>  
>> 
>> Kenneth Brotman
>> 
>>  
>> 
>> From: shalom sagges [mailto:shalomsagges@gmail.com] 
>> Sent: Tuesday, February 27, 2018 6:19 AM
>> To: user@cassandra.apache.org
>> Subject: Version Rollback
>> 
>>  
>> 
>> Hi All,
>> 
>> I'm planning to upgrade my C* cluster to version 3.x and was wondering what's the best way to perform a rollback if need be.
>> 
>> If I used snapshot restoration, I would be facing data loss, depending on when I took the snapshot (e.g. a rollback might be required after upgrading half the cluster).
>> 
>> If I add another DC to the cluster with the old version, then I could point the apps to talk to that DC if anything bad happens, but building it is really time consuming and requires a lot of resources.
>> 
>> Can anyone provide recommendations on this matter? Any ideas on how to make the upgrade foolproof, or at least "really really safe"?
>> 
>>  
>> 
>> Thanks!
>> 
>>  
>> 
> 
