We currently have setcompactionthroughput set to 0, and we are running on SSDs.
Even so, compaction falls behind.
What kind of compaction throughput should we be seeing with the throttle disabled? I usually see around 2-3 MB/s in system.log, which seems very slow to me.
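To put the 2-3 MB/s figure in perspective, here is a rough back-of-the-envelope sketch. The 50 GB backlog is a made-up number for illustration, not something measured on this cluster, and `hours_to_compact` is a hypothetical helper, not part of Cassandra.

```python
def hours_to_compact(backlog_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to rewrite backlog_gb of SSTable data at a steady rate."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = backlog_gb * 1024 / rate_mb_per_s
    return seconds / 3600


# Assumption: a 50 GB backlog. 2.5 MB/s is the midpoint of the observed
# 2-3 MB/s; 16 MB/s is the default throttle, shown only for comparison.
for rate in (2.5, 16.0):
    print(f"{rate} MB/s -> {hours_to_compact(50, rate):.1f} h")
# 2.5 MB/s -> 5.7 h
# 16.0 MB/s -> 0.9 h
```

If writes keep arriving faster than that, the backlog never drains, which would match what we are seeing.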
(sorry for short response, but currently replying from phone)
Generally the main knob for compaction performance is compaction_throughput_mb_per_sec in cassandra.yaml. It defaults to 16 (MB/s), and 0 disables throttling. You can use 'nodetool setcompactionthroughput' to set it on a running server. The next time the Cassandra server starts, it will use the value in the yaml again. You might try using nodetool to set the compaction throughput to different values to see if that helps. Generally you want to keep compaction throughput high enough that you don't fall behind, but low enough that it doesn't adversely affect read/write latency.
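Trying several values by hand gets tedious, so here is a minimal sketch of how it could be scripted. It assumes nodetool is on the PATH and that the node is reachable as "localhost"; `sweep_compaction_throughput`, the list of values, and the 10-minute settle time are all made up for illustration.

```python
import subprocess
import time


def sweep_compaction_throughput(values_mb_s, host="localhost", settle_s=600,
                                run=subprocess.check_call, sleep=time.sleep):
    """Apply each throttle value in turn, wait, then print compaction stats."""
    for mb_s in values_mb_s:
        # Takes effect immediately on the running node; not persisted to the yaml.
        run(["nodetool", "-h", host, "setcompactionthroughput", str(mb_s)])
        sleep(settle_s)  # give compaction time to run under the new setting
        # Shows pending tasks and progress; compare across settings.
        run(["nodetool", "-h", host, "compactionstats"])


# Example (not run here): sweep_compaction_throughput([16, 32, 64, 0])
# A value of 0 disables throttling.
```

Watch read/write latency while this runs; the pending task count alone does not tell you whether clients are being hurt.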
multithreaded_compaction is meant for special circumstances where you have extra disk IO lying around, such as when you're running Cassandra on SSDs. Some people have run it and had no problems. However, there are a few open issues with it, which is probably where "unstable" came from. I would stick with the compaction throughput setting.
On Sep 15, 2012, at 7:57 PM, Alexander N. Spitzer <email@example.com> wrote:
> We have a cluster using leveled compaction and there are only a couple
> CF. The cluster does not seem to be able to keep up with compaction.
> When running "top", I always see one core that is 100% busy, which I think
> is most likely the compaction thread.
> I wanted to enable multithreaded_compaction, but someone told me it
> was "unstable". Does anyone have any experience with this parameter?
> p.s. I am new to cassandra, so sorry if this is a silly question.
> -alex spitzer