cassandra-user mailing list archives

From Will Martin <w...@voodoolunchbox.com>
Subject Re: RF update
Date Tue, 16 Oct 2012 01:32:30 GMT
+1. It doesn't make sense that the transfer (xfr) compactions are heavy unless they are
translating the file. This could be a protocol mismatch; however, I would expect the
requirements for node-level compaction and wire compaction to be pretty different.
On Oct 15, 2012, at 4:42 PM, Matthias Broecheler wrote:

> Hey,
> 
> We are writing a lot of data into a Cassandra cluster for a batch loading use case. We
> cannot use the sstable batch loader, so in order to speed up the loading process we are
> using RF=1 while the data is loading. After the load is complete, we want to increase the
> RF. For that, we update the RF in the schema and then run the node repair tool on each
> Cassandra instance to stream the data over. However, we are noticing that this process is
> slowed down by a lot of compactions (the actual streaming of data only takes a couple of
> minutes).
> 
> Cassandra is already running a major compaction after the data loading process has
> completed. But then there appear to be two more compactions (one on the sender and one on
> the receiver), and those take a very long time even on the AWS high I/O instance with no
> compaction throttling.
> 
> Question: These additional compactions seem redundant since there are no reads or writes
> on the cluster after the first major compaction (immediately after the data load), is that
> right? And if so, what can we do to avoid them? We are currently waiting multiple days.
> 
> Thank you very much for your help,
> Matthias
> 
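
The repair step described in the quoted message (running the node repair tool on each instance after raising the RF) could be scripted roughly as follows. This is a minimal sketch, not the poster's actual tooling: the host addresses and keyspace name are placeholders, the helper names are hypothetical, and it assumes `nodetool` is on the PATH and that the RF has already been updated in the schema.

```python
import subprocess
from typing import List, Sequence


def build_repair_commands(hosts: Sequence[str], keyspace: str) -> List[List[str]]:
    """Return one `nodetool repair` invocation per host (hypothetical helper)."""
    return [["nodetool", "-h", host, "repair", keyspace] for host in hosts]


def run_repairs(hosts: Sequence[str], keyspace: str, dry_run: bool = True) -> None:
    """Repair nodes one at a time; with dry_run=True the commands are only printed."""
    for cmd in build_repair_commands(hosts, keyspace):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop if a repair exits with a non-zero status.
            subprocess.run(cmd, check=True)
```

Calling `run_repairs(["10.0.0.1", "10.0.0.2"], "my_keyspace")` only prints the commands. Running the nodes sequentially is a choice made for this sketch; the thread does not say whether the repairs were run one at a time or in parallel.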

