incubator-cassandra-user mailing list archives

From buddhasystem <potek...@bnl.gov>
Subject Re: How bad is the impact of compaction on performance?
Date Sat, 05 Feb 2011 17:48:04 GMT

Thanks Edward. In our usage scenario there is never any downtime; it's a global
24/7 operation.

Which is impacted worse, reads or writes?

How does a node handle compaction when there is a spike of writes coming to
it?



Edward Capriolo wrote:
> 
> On Sat, Feb 5, 2011 at 11:59 AM, buddhasystem <potekhin@bnl.gov> wrote:
>>
>> Just wanted to see if someone with experience in running an actual
>> service
>> can advise me:
>>
>> how often do you run nodetool compact on your nodes? Do you stagger it in
>> time, for each node? How badly is performance affected?
>>
>> I know this all seems too generic but then again no two clusters are
>> created
>> equal anyhow. Just wanted to get a feel.
>>
>> Thanks,
>> Maxim
>>
>> --
>> View this message in context:
>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-bad-is-teh-impact-of-compaction-on-performance-tp5995868p5995868.html
>> Sent from the cassandra-user@incubator.apache.org mailing list archive at
>> Nabble.com.
>>
> 
> This is an interesting topic. Cassandra can now remove tombstones during
> non-major compactions. For some use cases you may not have to trigger
> nodetool compact yourself to remove tombstones. Use cases that do not
> do many updates or deletes may have the least need to run compaction
> manually.
> 
> !However! If you have smaller SSTables, or fewer SSTables, your read
> operations will be more efficient.
> 
> If you have downtime, such as from 1AM-6AM, going through a major
> compaction might shrink your dataset significantly, and that will make
> reads better.
> 
> Compaction can be more or less intensive. The largest factor is row
> size. Users with large rows probably see faster compaction, while
> users with small rows see it take a long time. You can lower the
> priority of the compaction thread for experimentation.
> 
> As for performance, you want to get your cluster to a state where
> it is not compacting often. This may mean you need more nodes to
> handle writes.
> 
> I graph the compaction information from JMX
> http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp
> to get a feel for how often a node is compacting on average. I also
> cross-reference the compaction graphs with the read latency and IO
> graphs I have to see what impact compaction has on reads.
> 
> Forcing a major compaction also lowers the chances that a compaction
> will happen during the day at peak time. I major compact a few cluster
> nodes each night through cron (gc time 3 days). This has been good for
> keeping our data on disk as small as possible. Forcing the major
> compaction at night uses IO, but I find it saves IO over the course of
> the day because each read seeks less on disk.
> 
> 
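
For reference, the "major compact a few cluster nodes each night through
cron" approach quoted above could be sketched roughly as follows. This is only
an illustration, not Edward's actual script: the host names, the group size of
two, and the helper names nodes_for_night and compact are assumptions made up
for the example. The only real command it relies on is
"nodetool -h <host> compact", and it defaults to a dry run that just prints
what it would do.

```python
import subprocess
from datetime import date


def nodes_for_night(nodes, per_night, day_index):
    """Pick the group of nodes to major-compact on a given night.

    Nodes are split into consecutive groups of `per_night`, and the groups
    are visited round-robin, so each node is compacted once per cycle.
    """
    if per_night <= 0 or not nodes:
        return []
    groups = [nodes[i:i + per_night] for i in range(0, len(nodes), per_night)]
    return groups[day_index % len(groups)]


def compact(host, dry_run=True):
    """Run a major compaction on one node via nodetool (dry run by default)."""
    cmd = ["nodetool", "-h", host, "compact"]
    if dry_run:
        print("would run:", " ".join(cmd))
        return 0
    # Blocks until nodetool returns; the exit code is passed back to the caller.
    return subprocess.call(cmd)


if __name__ == "__main__":
    # Hypothetical host names; replace with the real cluster members.
    NODES = ["cass1", "cass2", "cass3", "cass4", "cass5"]
    # The day ordinal advances by one each night, rotating through the groups.
    for host in nodes_for_night(NODES, 2, date.today().toordinal()):
        compact(host)
```

Run from a single nightly cron entry during the quietest hours, a script like
this walks through the whole cluster over several nights instead of
compacting every node at once. With five nodes and two per night, the full
cycle takes three nights, which fits inside the 3-day gc time mentioned above.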

-- 
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-bad-is-the-impact-of-compaction-on-performance-tp5995868p5995978.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.
