Hi Paulo,

I just completed a migration from 1.1.10 to 1.2.10 and it was surprisingly painless. 

The course of action that I took:
1) describe cluster - make sure all nodes are on the same schema
2) shut off all maintenance tasks; i.e. make sure no scheduled repair is going to kick off in the middle of what you're doing
3) snapshot - maybe not necessary but it's so quick it makes no sense to skip this step
4) drain the nodes - I shut down the entire cluster rather than chance any incompatible gossip concerns that might come from a rolling upgrade. I have the luxury of controlling both the providers and consumers of our data, so this wasn't so disruptive for us.
5) Upgrade the nodes, turn them on one-by-one, monitor the logs for funny business.
6) nodetool upgradesstables
7) Turn various maintenance tasks back on, etc.
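
For what it's worth, the sequence above can be sketched as a small plan generator. This is only a rough sketch, not the exact commands I ran: the host names are placeholders, targeting a node with "nodetool -h <host>" is an assumption about how you reach your nodes, and the lines starting with "#" are manual steps (checking schema agreement, e.g. via "describe cluster;" in cassandra-cli, is left manual here). It only prints the plan; it does not execute anything.

```python
from typing import List


def nodetool(host: str, subcommand: str) -> str:
    """Format a nodetool invocation against one node (host is a placeholder)."""
    return f"nodetool -h {host} {subcommand}"


def upgrade_plan(hosts: List[str]) -> List[str]:
    """Return the full-cluster-restart upgrade steps in order, as text lines."""
    plan = [
        "# verify schema agreement on every node (e.g. 'describe cluster;' in cassandra-cli)",
        "# disable scheduled repairs and other maintenance jobs",
    ]
    # Snapshot every node first, then drain every node, so no node is
    # drained before the whole cluster has a snapshot.
    for sub in ("snapshot", "drain"):
        plan += [nodetool(h, sub) for h in hosts]
    plan.append("# stop Cassandra on ALL nodes, install the new version, merge cassandra.yaml")
    # Bring nodes back one at a time, watching the logs in between.
    for h in hosts:
        plan.append(f"# start Cassandra on {h} and check its log before starting the next node")
    # Rewrite SSTables in the new format once the whole cluster is up.
    plan += [nodetool(h, "upgradesstables") for h in hosts]
    plan.append("# re-enable maintenance jobs")
    return plan
```

Printing each line of upgrade_plan(["node1", "node2", "node3"]) gives a checklist you can walk through by hand or adapt to whatever orchestration you already use.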

The worst part was managing the yaml/config changes between the versions. It wasn't horrible, but the diff was "noisier" than a more incremental upgrade typically is. A few things I recall that were special:
1) Since you have an existing cluster, you'll probably need to set the default partitioner back to RandomPartitioner in cassandra.yaml. I believe that is outlined in NEWS. 
2) I set the initial tokens to be the same as what the nodes held previously. 
3) The timeout is now divided into more granular settings, and you get to decide how (or if) to change each one from its default.
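
A rough sketch of those three config points, in case it helps when diffing the yaml. It takes the relevant settings from an old 1.1 cassandra.yaml (as a plain dict) and produces the values to carry into the 1.2 file. The function name carry_over is made up for illustration, and copying the old rpc_timeout_in_ms into each of the split timeout keys is just one possible choice, not a recommendation; check the key names and defaults against the cassandra.yaml shipped with your target version.

```python
from typing import Dict, List

# Keys that, as far as I recall, replace the single rpc_timeout_in_ms in 1.2.
# Verify against the cassandra.yaml of the version you are installing.
SPLIT_TIMEOUT_KEYS: List[str] = [
    "read_request_timeout_in_ms",
    "range_request_timeout_in_ms",
    "write_request_timeout_in_ms",
    "request_timeout_in_ms",
]

RANDOM_PARTITIONER = "org.apache.cassandra.dht.RandomPartitioner"


def carry_over(old: Dict[str, object]) -> Dict[str, object]:
    """Build the settings to set explicitly in the new cassandra.yaml."""
    new: Dict[str, object] = {
        # Existing data was written with the old partitioner, so keep it.
        "partitioner": old.get("partitioner", RANDOM_PARTITIONER),
        # Keep each node on the token it already owns.
        "initial_token": old["initial_token"],
    }
    # Assumption: reuse the old single timeout for each split setting.
    if "rpc_timeout_in_ms" in old:
        for key in SPLIT_TIMEOUT_KEYS:
            new[key] = old["rpc_timeout_in_ms"]
    return new
```

If you would rather take the new defaults for the timeouts, simply leave rpc_timeout_in_ms out of the input dict and only the partitioner and token are carried over.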

tl;dr: I did a standard upgrade and paid careful attention to the NEWS.txt upgrade notices. I did a full cluster restart and NOT a rolling upgrade. It went without a hitch.


On Tue, Sep 24, 2013 at 2:33 PM, Paulo Motta <pauloricardomg@gmail.com> wrote:
Cool, sounds fair enough. Thanks for the help, Rob!

If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to share any tips on issues you've encountered that are not yet documented.



2013/9/24 Robert Coli <rcoli@eventbrite.com>
On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta <pauloricardomg@gmail.com> wrote:
Doesn't the probability of something going wrong increase as the gap between the versions increases? So, using this reasoning, upgrading from 1.1.10 to 1.2.6 would have less chance of something going wrong than upgrading from 1.1.10 to 1.2.9 or 1.2.10.

Sorta, but sorta not. 

NEWS.txt is the canonical source of concerns on upgrade. There are a few cases where upgrading to the "root" of X.Y.Z creates issues that do not exist if you upgrade to the "head" of that line. AFAIK there have been no cases where upgrading to the "head" of a line (where that line is mature, like 1.2.10) has created problems which would have been avoided by upgrading to the "root" first.
I'm hoping this reasoning is wrong and I can update directly from 1.1.10 to 1.2.10. :-)

That's what I plan to do when we move to 1.2.X, FWIW.


Paulo Ricardo

European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST