cassandra-user mailing list archives

From Stephane Legay <sle...@looplogic.com>
Subject Re: sstable usage doubles after repair
Date Thu, 20 Nov 2014 18:29:46 GMT
Thanks for the response.

Yes, I went through 1.1, 1.2, and 2.0 as rolling upgrades (the entire
cluster at each version) and ran upgradesstables each time.
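
For reference, the per-node sequence looked roughly like this (the service
name and install step are from my setup and may differ on yours):

    nodetool drain                  # flush memtables and stop accepting writes
    sudo service cassandra stop
    # ... install the next Cassandra version ...
    sudo service cassandra start
    nodetool upgradesstables        # rewrite SSTables in the new on-disk format
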
Yes, the nodes are using the same tokens. I can see them when running
nodetool ring, and they're consistent with what we had before.
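
To double-check, I compared the token column of nodetool ring against the
list we saved from the old cluster, roughly like this, modulo trimming the
header lines (tokens.old is just an illustrative name for our saved list):

    nodetool ring | awk '{print $NF}' | sort > tokens.new
    diff tokens.old tokens.new
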
Repairs were very infrequent because we do not delete data: once every 2 or
3 months, plus a forced repair whenever a node was down for more than a few
hours.

I'll keep an eye on the number of repaired rows.
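
To track that, I'll grep each node's log for the out-of-sync range counts
that repair reports (log path and exact wording may vary by version):

    grep -i "out of sync" /var/log/cassandra/system.log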

How should I go about inspecting SSTables?

Thanks again.



On Thu, Nov 20, 2014 at 11:15 AM, Robert Coli <rcoli@eventbrite.com> wrote:

> On Thu, Nov 20, 2014 at 8:36 AM, Stephane Legay <slegay@looplogic.com>
> wrote:
>
>> I upgraded a 2-node cluster with RF = 2 from 1.0.9 to 2.0.11. I did
>> rolling upgrades and upgradesstables after each upgrade.
>>
>
> To be clear, did you go through 1.1 and 1.2, or did you go directly from
> 1.0 to 2.0?
>
>
>> We then moved our data to new hardware by shutting down each node, moving
>> data to new machine, and starting up with auto_bootstrap = false.
>>
>
> This should not be implicated, especially if you verified the upgraded
> nodes came up with the same tokens they had before.
>
>
>> When all was done I ran a repair, and data went from 250 GB to 400 GB per
>> node. A week later, I am running another repair, and data is filling the
>> 800 GB drive on each machine, with huge compactions running constantly.
>>
>
> How frequently had you been running repair in 1.0.9? How often do you
> DELETE?
>
>
>> Where should I go from here? Will scrubbing fix the issue?
>>
>
> I would inspect the newly created SSTables from a repair and see what they
> contain. I would also look at log lines which indicate how many rows are
> being repaired, with a special eye towards whether the number of rows
> repaired each time you repair is decreasing.
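>
> For example, sstable2json (which ships with 2.0) will dump an SSTable's
> contents so you can see what repair actually streamed; something like this,
> with a hypothetical keyspace/table/generation in the path:
>
>     sstable2json /var/lib/cassandra/data/ks/cf/ks-cf-jb-1234-Data.db | less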
>
> Also note that repair in 2.0 is serial by default; you probably want the
> old parallel behavior, which you can get with the "-par" flag.
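>
> Something like the following, run on each node (check your version's
> nodetool help output for the exact flag):
>
>     nodetool repair -par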
>
> =Rob
> http://twitter.com/rcolidba
>



-- 
Stephane Legay
Co-founder and CTO
LoopLogic, LLC

slegay@looplogic.com
480-326-4080
