The DataStax manual suggests that when expanding a Cassandra cluster, an administrator has to run nodetool cleanup on each of the previously existing nodes to remove the keys that no longer belong to those nodes. The manual further says that nodetool cleanup should be run sequentially on the existing nodes, one node at a time.
Here is my problem: I have a very large Cassandra cluster with hundreds of nodes, and running nodetool cleanup sequentially will take a long time to finish.
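For reference, the sequential procedure can be sketched roughly as follows. This is only a minimal illustration, assuming nodetool is on the PATH, can reach each node with -h, and does not return until cleanup on that node has finished; the function names and host names are hypothetical.

```python
import subprocess


def cleanup_command(host):
    # `nodetool -h <host> cleanup` asks that node to drop data it no longer owns.
    return ["nodetool", "-h", host, "cleanup"]


def run_cleanup_sequentially(hosts, run=subprocess.run):
    """Run cleanup on one node at a time and stop at the first failure.

    `run` is injectable so the loop can be exercised without a real cluster.
    """
    finished = []
    for host in hosts:
        result = run(cleanup_command(host))
        if result.returncode != 0:
            raise RuntimeError(f"nodetool cleanup failed on {host}")
        finished.append(host)
    return finished


# Example call (hypothetical host names):
# run_cleanup_sequentially(["cass-node-01", "cass-node-02"])
```

With hundreds of nodes, the total time of this loop is the sum of the per-node cleanup times, which is why I am asking about the alternatives.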
Questions: a) What are the implications of running nodetool cleanup concurrently on the entire cluster?
b) Will Cassandra automatically remove the obsolete keys at some point in the future?