I had thought that the topology file is used only for replica placement, so that data in the token range the unknown node is responsible for would still be read from and written to that node.  It just wouldn't be replicated, since no replication factor is defined for it.

Bill

On Thu, Apr 19, 2012 at 1:18 PM, Richard Lowe <richard.lowe@arkivum.com> wrote:

Yes it is possible. Put the following as the last line of your topology file:

default=unknown:unknown

So long as you don’t have any DC or rack with this name, your local node will not be able to address any nodes that aren’t explicitly listed in its topology file.

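To illustrate the fallback, here is a rough Python sketch of how an unlisted node ends up in the 'unknown' DC and rack. It is not Cassandra's own code: the addresses, the DC1/RAC1/RAC2 names and the helper functions parse_topology and locate are all made up for the example.

```python
# Hypothetical sketch: mimics a property-file topology lookup that falls back
# to the "default" entry. This is NOT Cassandra's actual implementation.

SAMPLE_TOPOLOGY = """\
# Known nodes (addresses are invented for illustration)
10.0.0.1=DC1:RAC1
10.0.0.2=DC1:RAC2
# Everything else lands in a DC/rack that no real node uses
default=unknown:unknown
"""


def parse_topology(text):
    """Return a dict mapping node address (or 'default') to (dc, rack)."""
    topology = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        dc, _, rack = value.partition(":")
        topology[key.strip()] = (dc.strip(), rack.strip())
    return topology


def locate(topology, address):
    """Look up a node; unlisted nodes receive the 'default' entry."""
    return topology.get(address, topology["default"])


topology = parse_topology(SAMPLE_TOPOLOGY)
print(locate(topology, "10.0.0.1"))  # a node listed explicitly
print(locate(topology, "10.0.0.9"))  # an unlisted node, resolved via default
```

In this sketch the unlisted address resolves to a DC and rack that hold no real nodes, which is the effect the default=unknown:unknown line is meant to have.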

However, bear in mind that, whilst Cassandra won’t try to use replication factor to store to these ‘unknown’ nodes, their tokens may mean that the ‘natural’ home for a row is on a node that is not addressable. This can create holes in your dataset, and data can ‘disappear’ because the row’s token places it on a particular node but the coordinator can’t contact that node to get at the data.

Careful use of replication factor and NetworkTopologyStrategy can help with this, but you should make sure that a node really doesn’t need to contact the unknown nodes before marking them as such.

Richard

From: Bill Au [mailto:bill.w.au@gmail.com]
Sent: 19 April 2012 17:16
To: user@cassandra.apache.org
Subject: default required in cassandra-topology.properties?

All the examples of cassandra-topology.properties that I have seen have a default entry assigning unknown nodes to a specific data center and rack.  Is it possible to have Cassandra ignore unknown nodes for the purpose of replication?

Bill