If the network and both DCs can handle the load, it's fine (the new DC would ...). Keep an eye on the logs for streaming failures, as a failed stream is not always obvious and you could end up with missing data. Also be aware that a rebuild puts load on the source DC, so if it is already busy, be careful not to impact it.
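
As a rough way to watch for those failures, something like the sketch below could scan system.log on each node. The two patterns ("Stream failed" and "Streaming error occurred") are assumptions about what the log lines look like, and the function name and log path are only illustrative, so check them against what your version actually writes.

```python
import re
import sys
from typing import Iterable, List

# Assumed patterns: adjust to whatever your Cassandra version actually logs.
STREAM_FAILURE = re.compile(r"Stream failed|Streaming error occurred", re.IGNORECASE)


def find_stream_failures(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match one of the assumed failure patterns."""
    return [line.rstrip("\n") for line in lines if STREAM_FAILURE.search(line)]


if __name__ == "__main__":
    # Hypothetical usage: python find_stream_failures.py /var/log/cassandra/system.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for hit in find_stream_failures(handle):
                print(f"{path}: {hit}")
```

An empty result only means none of the assumed patterns matched; it is not proof that every stream completed.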

I'm not sure that memtable_cleanup_threshold affects streamed SSTables. It seems unlikely that streamed SSTables would also be added to memtables, though your DC would obviously be receiving writes at the same time. 0.7 seems quite high; what are your heap settings and memtable_flush_writers?
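
For comparison with 0.7: as far as I recall, the documented default for memtable_cleanup_threshold is 1 / (memtable_flush_writers + 1). A minimal sketch of that arithmetic follows; the function name is mine and the writer counts are only example values.

```python
def default_cleanup_threshold(memtable_flush_writers: int) -> float:
    """Default memtable_cleanup_threshold, assuming 1 / (memtable_flush_writers + 1)."""
    if memtable_flush_writers < 1:
        raise ValueError("memtable_flush_writers must be at least 1")
    return 1.0 / (memtable_flush_writers + 1)


if __name__ == "__main__":
    # Example writer counts only; use the value from your own cassandra.yaml.
    for writers in (2, 4, 8):
        print(writers, round(default_cleanup_threshold(writers), 3))
```

Under that assumption the default stays at or below roughly 0.33 for two or more flush writers, which is why 0.7 stands out.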

On 2 November 2016



I am trying to rebuild a new data center with 50 nodes and expect 1 TB per node. The nodes are backed by SSDs, and the rebuild is streaming from another DC in the same physical region. This is on Cassandra 2.1.13.


I am doing this with stream_throughput=200 MB, concurrent_compactors=256, compactionthroughput=0, and memtable_cleanup_threshold=0.7 (the memtable setting was necessary to keep the number of SSTable files in check), running the rebuild on 20 nodes at a time.
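
For a rough sense of scale, here is a back-of-the-envelope sketch of how long it takes to move 1 TB at a rate of 200. It is worked for both megabytes and megabits per second, since the cassandra.yaml setting (stream_throughput_outbound_megabits_per_sec) is expressed in megabits. The function name is hypothetical, and the estimate assumes a single sustained stream at the full rate, ignoring compaction, retries, and multiple source nodes.

```python
def min_stream_hours(data_tb: float, rate: float, unit: str) -> float:
    """Hours to move data_tb terabytes at `rate` per second, unit 'MB' or 'Mb'."""
    total_bytes = data_tb * 1e12  # decimal terabytes
    if unit == "MB":
        bytes_per_sec = rate * 1e6  # megabytes per second
    elif unit == "Mb":
        bytes_per_sec = rate * 1e6 / 8  # megabits per second
    else:
        raise ValueError("unit must be 'MB' or 'Mb'")
    return total_bytes / bytes_per_sec / 3600


if __name__ == "__main__":
    for unit in ("MB", "Mb"):
        print(unit, round(min_stream_hours(1.0, 200, unit), 1), "hours")
```

That works out to about 1.4 hours per node if the 200 is megabytes per second, and about 11 hours if it is megabits per second.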


Have people generally attempted rebuilds this large? Any tips?


Thanks!