From Robert Coli <>
Subject Re: Read inconsistency after backup and restore to different cluster
Date Thu, 14 Nov 2013 21:15:24 GMT
On Thu, Nov 14, 2013 at 12:37 PM, David Laube <> wrote:

> It is almost as if the data only exists on some of the nodes, or perhaps
> the token ranges are dramatically different --again, we are using vnodes so
> I am not exactly sure how this plays into the equation.

The token ranges are dramatically different because, with num_tokens set
and initial_token unset, each vnode node picks its tokens at random.

You can verify this by listing the tokens per physical node in nodetool
gossipinfo or nodetool ring (nodetool status, iirc, shows only the number
of tokens per node).
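
Once the per-node token lists have been collected from both clusters, the
comparison can be sketched roughly as below. This is only an illustration:
compare_token_ownership, the node names and the token values are all
hypothetical, and how the tokens are extracted from nodetool output is left
out on purpose.

```python
def compare_token_ownership(source, target):
    """Return the source nodes whose exact token set is owned by no target node.

    Both arguments map a node name to an iterable of tokens. The names and
    tokens used below are made up for illustration.
    """
    # Each target node's tokens as an order-independent, hashable set.
    target_sets = {frozenset(tokens) for tokens in target.values()}
    return sorted(
        node for node, tokens in source.items()
        if frozenset(tokens) not in target_sets
    )


# Hypothetical token lists, e.g. copied by hand from each cluster.
source = {"a1": [-9000, 100, 7000], "a2": [-3000, 4000, 8500]}
target = {"b1": [-8123, 55, 6100], "b2": [-3000, 4000, 8500]}

# Source nodes listed here have no target node with identical tokens, so a
# snapshot copied 1:1 from them would land on the wrong ranges.
print(compare_token_ownership(source, target))
```

An empty result would mean every source node has a target node owning
exactly the same tokens.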

> 5. Copy 1 of the 5 snapshot archives from cluster-A to each of the five
> nodes in the new cluster-B ring.

I don't understand this at all. Do you mean that you are using one source
node's data to load each of the target nodes? Or are you just saying
there's a 1:1 relationship between source snapshots and target nodes to
load into? Unless you have RF=N, using one source for 5 target nodes won't
work.

To do what I think you're attempting to do, you have basically two options.

1) don't use vnodes and do a 1:1 copy of snapshots
2) use vnodes and
   a) get a list of tokens per node from the source cluster
   b) put a comma delimited list of these in initial_token in
cassandra.yaml on target nodes
   c) probably have to un-set num_tokens (this part is unclear to me, you
will have to test..)
   d) set auto_bootstrap:false in cassandra.yaml
   e) start target nodes; they will join without bootstrapping, owning the
same ranges as the source cluster
   f) load schema / copy data into datadir (being careful of
   g) restart node or use nodetool refresh (I'd probably restart the node
to avoid the bulk rename that refresh does) to pick up sstables
   h) remove auto_bootstrap:false from cassandra.yaml
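
Steps a), b) and d) of option 2 could be scripted along these lines. This is
a minimal sketch, not something I have run against a real cluster:
render_token_overrides is a hypothetical helper, the tokens are made up, and
it deliberately does not touch num_tokens, since whether that has to be
un-set is exactly the part that needs testing.

```python
def render_token_overrides(tokens):
    """Build the cassandra.yaml lines for one target node.

    `tokens` is the token list taken from the matching source node.
    Hypothetical helper: it emits only initial_token and auto_bootstrap.
    """
    unique = sorted({int(t) for t in tokens})
    if not unique:
        raise ValueError("need at least one token")
    lines = [
        # Comma-delimited list of the source node's tokens (step b).
        "initial_token: " + ",".join(str(t) for t in unique),
        # Join without streaming; data is copied in by hand (step d).
        "auto_bootstrap: false",
    ]
    return "\n".join(lines) + "\n"


# Made-up tokens for one source node; a real vnode node has many more.
print(render_token_overrides([7000, -9000, 100]), end="")
```

The output would be merged into cassandra.yaml on the target node that
receives that source node's snapshot.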

I *believe* this *should* work, but have never tried it as I do not
currently run with vnodes. It should work because it basically makes
implicit vnode tokens explicit in the conf file. If it *does* work, I'd
greatly appreciate you sharing details of your experience with the list.

General reference on tasks of this nature (does not consider vnodes, but
treat vnodes as "just a lot of physical nodes" and it is mostly relevant) :


