cassandra-user mailing list archives

From Eran Chinthaka Withana <eran.chinth...@gmail.com>
Subject Re: Few Clarifications on Major Compactions
Date Thu, 01 Mar 2012 05:39:21 GMT
Thanks Maki and Tyler.

Re: Q1: I think it's time for me to look at LeveledCompaction. But I'm
happy to know I can run major compactions as often as I like if I can
afford it.

Re: Q2: Apart from the high IO impact, if there won't be any data
corruption/consistency issues, I think I can afford this too.
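
One way to keep the two cron jobs from overlapping on a node might be to
wrap both in a shared lock. This is only a minimal sketch, assuming a
Unix host with Python 3; the function name run_exclusive and the lock
path are hypothetical, and the wrapped commands would be the usual
"nodetool compact" and "nodetool repair".

```python
import fcntl
import subprocess
import sys


def run_exclusive(command, lock_path="/tmp/cassandra-maintenance.lock"):
    """Run `command` only if no other maintenance job holds the lock.

    Returns the command's exit code, or None if the lock was busy and
    the command was skipped. The lock path is a hypothetical default.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: skip the run instead of queueing.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        # The lock is released when lock_file is closed on leaving the block.
        return subprocess.call(command)
```

Each cron entry would then call run_exclusive(["nodetool", "compact"]) or
run_exclusive(["nodetool", "repair"]), so whichever job starts second on
that node is skipped rather than competing for IO. This only guards a
single node; it does not stop two replicas from compacting at once.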

Thanks,
Eran Chinthaka Withana


On Wed, Feb 29, 2012 at 7:17 PM, Tyler Hobbs <tyler@datastax.com> wrote:

> At this point, using LeveledCompaction is a much better way to have good
> guarantees about how many sstables your reads will hit (and thus better
> latency guarantees) than SizeTiered with periodic major compactions.
>
>
> On Wed, Feb 29, 2012 at 8:49 PM, Maki Watanabe <watanabe.maki@gmail.com> wrote:
>
>> DataStax does not recommend running major compactions now:
>>  http://www.datastax.com/docs/1.0/operations/tuning
>> But if you can afford it, major compaction will improve read latency, as
>> you have seen.
>>
>> Major compaction is expensive, so you will not want to run it during
>> high-traffic hours. You should not run it on more than one replica node
>> at the same time. You also should not run repair and major compaction
>> at the same time on the same (affected) node, because both tasks require
>> massive IO.
>> Within these constraints, the more often you run major compaction, the
>> better read latency you will get.
>>
>> 2012/3/1 Eran Chinthaka Withana <eran.chinthaka@gmail.com>:
>> > Hi,
>> >
>> > I have two questions on major compactions (the ones a user initiates
>> > using nodetool) and I would really appreciate it if someone can help.
>> >
>> > 1. I've noticed that when I run compactions the read latency improves
>> > even more than I expected (which is good :) ). The improvement is so
>> > tempting that I'd like to run this almost every week :). I understand
>> > that after a compaction Cassandra will create one giant SSTable, and if
>> > something happens to it things can go a little bit crazy. So from your
>> > experience, how often should we be running compactions? What parameters
>> > will influence this frequency?
>> >
>> > 2. I'm thinking of scheduling compactions using a cron job. But the
>> > issue is that I have also scheduled repairs using a cron job to run once
>> > per GC period (10 days by default). Now the obvious question is: what
>> > will happen if a node is running both the compaction AND the repair at
>> > the same time? Is this something we should avoid at all costs? What will
>> > be the implications?
>> >
>> > Thanks,
>> > Eran Chinthaka Withana
>> >
>>
>>
>>
>> --
>> w3m
>>
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
>
