incubator-cassandra-user mailing list archives

From William Oberman <ober...@civicscience.com>
Subject Re: how to stop out of control compactions?
Date Thu, 04 Apr 2013 19:45:03 GMT
Ah, 0 is the magic?  Odd email thread now.... I asked about the best
practice of disabling compactions, Gregg said he set threshold = 100000, you
+1'd, I said I couldn't set > 32, and now we're at 0 ;-)
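
For anyone landing on this thread later, a rough sketch of the two settings
discussed is below. It only prints the nodetool commands rather than running
them. The keyspace and column family names are placeholders, setting both min
and max to 0 is one reading of Aaron's advice (he only says to set it to 0),
and 4/32 are assumed to be the thresholds you would restore afterwards.

```shell
# Dry-run sketch: prints nodetool commands instead of executing them.
# Keyspace / column family names passed in are placeholders, not from the thread.

# Print the command that disables minor compactions for one column family.
# Assumption: min = max = 0 (the thread only says "set it to 0").
disable_compaction_cmd() {
    echo "nodetool setcompactionthreshold $1 $2 0 0"
}

# Print the command that puts the thresholds back.
# Assumption: 4 and 32 are the values to restore unless others are given.
restore_compaction_cmd() {
    echo "nodetool setcompactionthreshold $1 $2 ${3:-4} ${4:-32}"
}

disable_compaction_cmd my_keyspace my_cf
restore_compaction_cmd my_keyspace my_cf
```

nodetool talks to one node at a time, so the command would need to be repeated
per node and per column family.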

will

On Wed, Apr 3, 2013 at 8:50 PM, aaron morton <aaron@thelastpickle.com> wrote:

>  And it appears I can't set min > 32
>
> Why did you want to set it so high ?
> If you want to disable compaction set it to 0.
>
> Cheers
>
>     -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 2/04/2013, at 8:43 PM, William Oberman <oberman@civicscience.com>
> wrote:
>
> I just tried to use this setting (I'm using 1.1.9).  And it appears I
> can't set min > 32, as that's the max max now (using nodetool at least).
>  Not sure if JMX would allow more access, but I don't like bypassing things
> I don't fully understand.  I think I'll just leave my compaction killers
> running instead (not that killing compactions constantly isn't messing with
> things as well....).
>
> will
>
>
> On Tue, Apr 2, 2013 at 10:43 AM, William Oberman <oberman@civicscience.com> wrote:
>
>> Edward, you make a good point, and I do think I am getting closer to having
>> to increase my cluster size (I'm at roughly 300GB/node now).
>>
>> In my case, I think it was neither.  I had one node OOM after working on
>> a large compaction but it continued to run in a zombie like state
>> (constantly GC'ing), which I didn't have an alert on.  Then I had the bad
>> luck of a "close token" also starting a large compaction.  I have RF=3 with
>> some of my R/W patterns at quorum, causing that segment of my cluster to
>> get slow (e.g. a % of my traffic started to slow).  I was running 1.1.2
>> (I haven't had to poke anything for quite some time, obviously), so I
>> upgraded before moving on (as I saw a lot of bug fixes to compaction issues
>> in release notes).  But the upgrade caused even more nodes to start
>> compactions.  Which led to my original email... I had a cluster where 80%
>> of my nodes were compacting, and I really needed to boost production
>> traffic and couldn't seem to "tamp cassandra down" temporarily.
>>
>> Thanks for the advice everyone!
>>
>> will
>>
>>
>> On Tue, Apr 2, 2013 at 10:20 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>>
>>> Settings do not make compactions go away. If your compactions are "out
>>> of control" it usually means one of these things:
>>> 1) you have a corrupt table that the compaction never finishes on, so the
>>> sstable count keeps growing
>>> 2) you do not have enough hardware to handle your write load
>>>
>>>
>>> On Tue, Apr 2, 2013 at 7:50 AM, William Oberman <
>>> oberman@civicscience.com> wrote:
>>>
>>>> Thanks Gregg & Aaron. Missed that setting!
>>>>
>>>> On Tuesday, April 2, 2013, aaron morton wrote:
>>>>
>>>>> Set the min and max
>>>>> compaction thresholds for a given column family
>>>>>
>>>>> +1 for setting the max_compaction_threshold (as well as the min) on
>>>>> a CF when you are getting behind. It can limit the size of the
>>>>> compactions and give things a chance to complete in a reasonable time.
>>>>>
>>>>> Cheers
>>>>>
>>>>>    -----------------
>>>>> Aaron Morton
>>>>> Freelance Cassandra Consultant
>>>>> New Zealand
>>>>>
>>>>> @aaronmorton
>>>>> http://www.thelastpickle.com
>>>>>
>>>>> On 2/04/2013, at 3:42 AM, Gregg Ulrich <gulrich@netflix.com> wrote:
>>>>>
>>>>> You may want to set compaction threshold and not throughput.  If you
>>>>> set the min threshold to something very large (100000), compactions will
>>>>> not start until cassandra finds this many files to compact (which it
>>>>> should not).
>>>>>
>>>>> In the past I have used this to stop compactions on a node, and then
>>>>> run an offline major compaction to get through the compaction, then set
>>>>> the min threshold back.  Not everyone likes major compactions though.
>>>>>
>>>>>
>>>>>
>>>>>   setcompactionthreshold <keyspace> <cfname> <minthreshold>
>>>>> <maxthreshold> - Set the min and max
>>>>> compaction thresholds for a given column family
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 1, 2013 at 12:38 PM, William Oberman <
>>>>> oberman@civicscience.com> wrote:
>>>>>
>>>>>> I'll skip the prelude, but I worked myself into a bit of a jam.  I'm
>>>>>> recovering now, but I want to double check if I'm thinking about
>>>>>> things correctly.
>>>>>>
>>>>>> Basically, I was in a state where a majority of my servers wanted to
>>>>>> do compactions, and rather large ones.  This was impacting my site
>>>>>> performance.  I tried nodetool stop COMPACTION.  I tried
>>>>>> setcompactionthroughput=1.  I tried restarting servers, but they'd
>>>>>> restart the compactions pretty much immediately on boot.
>>>>>>
>>>>>> Then I realized that:
>>>>>> nodetool stop COMPACTION
>>>>>> only stopped running compactions, and then the compactions would
>>>>>> re-enqueue themselves rather quickly.
>>>>>>
>>>>>> So, right now I have:
>>>>>> 1.) scripts running on N-1 servers looping on "nodetool stop
>>>>>> COMPACTION" in a tight loop
>>>>>> 2.) On the "Nth" server I've disabled gossip/thrift and turned up
>>>>>> setcompactionthroughput to 999
>>>>>> 3.) When the Nth server completes, I pick from the remaining N-1
>>>>>> (well, I'm still running the first compaction, which is going to
>>>>>> take 12 more hours, but that is the plan at least).
>>>>>>
>>>>>> Does this make sense?  Other than the fact there were probably warning
>>>>>> signs that would have prevented me from getting into this state in
>>>>>> the first place? :-)
>>>>>>
>>>>>> will
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>
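
The one-node-at-a-time plan from the original message at the bottom of the
thread could be sketched roughly as follows. This is a dry run that only
prints nodetool commands. Host names are placeholders, 999 is the throughput
value mentioned in the thread, and 16 is an assumed throttle to return to
afterwards; the re-enable steps are not described in the thread and are added
here as the obvious counterpart to disabling gossip and thrift.

```shell
# Dry-run sketch of the one-node-at-a-time plan: prints nodetool commands only.
# Host names are placeholders; only the nodetool subcommands come from the thread.

# Commands for the node that is allowed to compact: take it out of client
# traffic, then lift the compaction throttle (999 is the value from the thread).
isolate_and_compact_cmds() {
    host=$1
    echo "nodetool -h $host disablegossip"
    echo "nodetool -h $host disablethrift"
    echo "nodetool -h $host setcompactionthroughput 999"
}

# Commands to bring that node back once its compaction has finished.
# Assumption: 16 MB/s is the throttle to return to unless another is given.
rejoin_cmds() {
    host=$1
    echo "nodetool -h $host setcompactionthroughput ${2:-16}"
    echo "nodetool -h $host enablethrift"
    echo "nodetool -h $host enablegossip"
}

# One pass of the "compaction killer" over each remaining node.
suppress_cmds() {
    for host in "$@"; do
        echo "nodetool -h $host stop COMPACTION"
    done
}

isolate_and_compact_cmds node1
suppress_cmds node2 node3
rejoin_cmds node1
```

Since stopped compactions re-enqueue themselves, the suppress pass would have
to be repeated for as long as the isolated node is still compacting.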
