cassandra-user mailing list archives

From brian.spindler@gmail.com
Subject Re: TWCS Compaction backed up
Date Wed, 08 Aug 2018 00:16:41 GMT
Everything is ttl’d 

I suppose I could use sstablemetadata to see the repaired bit. Could I just set that back to unrepaired somehow, and would that fix it?
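
For reference, I'm imagining something like this (the path is just a placeholder for one of our Data.db files, and I'm assuming the sstablerepairedset tool in tools/bin is the right way to flip it back, with the node stopped first - a sketch, not something I've run):

    # check whether an sstable has been marked repaired (non-zero "Repaired at")
    sstablemetadata /var/lib/cassandra/data/<keyspace>/<table>/<sstable>-Data.db | grep "Repaired at"

    # with the node down, mark the sstable back to unrepaired
    sstablerepairedset --really-set --is-unrepaired /var/lib/cassandra/data/<keyspace>/<table>/<sstable>-Data.db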

Thanks!
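
P.S. On your suggestion below about toggling off the tombstone compaction: I'm assuming that just means re-issuing the compaction options with unchecked_tombstone_compaction flipped back to false, along these lines (keyspace/table are placeholders) - or is there more to it?

    ALTER TABLE <keyspace>.<table> WITH compaction = {
        'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': '1',
        'timestamp_resolution': 'MILLISECONDS',
        'tombstone_threshold': '0.2',
        'tombstone_compaction_interval': '86400',
        'unchecked_tombstone_compaction': 'false'
    };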

> On Aug 7, 2018, at 8:12 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
> 
> May be worth seeing if any of the sstables got promoted to repaired - if so they’re not eligible for compaction with unrepaired sstables and that could explain some higher counts
> 
> Do you actually do deletes or is everything ttl’d?
>  
> 
> -- 
> Jeff Jirsa
> 
> 
>> On Aug 7, 2018, at 5:09 PM, Brian Spindler <brian.spindler@gmail.com> wrote:
>> 
>> Hi Jeff, mostly lots of little files: there will be 4-5 that are 1-1.5GB or so, and then many at 5-50MB and many at 40-50MB each.
>> 
>> Re incremental repair: yes, one of my engineers started an incremental repair on this column family that we had to abort. In fact, the node that the repair was initiated on ran out of disk space and we ended up replacing that node like a dead node.
>> 
>> Oddly the new node is experiencing this issue as well.  
>> 
>> -B
>> 
>> 
>>> On Tue, Aug 7, 2018 at 8:04 PM Jeff Jirsa <jjirsa@gmail.com> wrote:
>>> You could toggle off the tombstone compaction to see if that helps, but that should be lower priority than normal compactions
>>> 
>>> Are the lots-of-little-files from memtable flushes or repair/anticompaction?
>>> 
>>> Do you do normal deletes? Did you try to run incremental repair?
>>> 
>>> -- 
>>> Jeff Jirsa
>>> 
>>> 
>>>> On Aug 7, 2018, at 5:00 PM, Brian Spindler <brian.spindler@gmail.com> wrote:
>>>> 
>>>> Hi Jonathan, both I believe.  
>>>> 
>>>> The window size is 1 day, full settings: 
>>>>     AND compaction = {'timestamp_resolution': 'MILLISECONDS', 'unchecked_tombstone_compaction': 'true', 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS', 'tombstone_compaction_interval': '86400', 'tombstone_threshold': '0.2', 'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy'}
>>>> 
>>>> 
>>>> nodetool tpstats 
>>>> 
>>>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>>>> MutationStage                     0         0    68582241832         0                 0
>>>> ReadStage                         0         0      209566303         0                 0
>>>> RequestResponseStage              0         0    44680860850         0                 0
>>>> ReadRepairStage                   0         0       24562722         0                 0
>>>> CounterMutationStage              0         0              0         0                 0
>>>> MiscStage                         0         0              0         0                 0
>>>> HintedHandoff                     1         1            203         0                 0
>>>> GossipStage                       0         0        8471784         0                 0
>>>> CacheCleanupExecutor              0         0            122         0                 0
>>>> InternalResponseStage             0         0         552125         0                 0
>>>> CommitLogArchiver                 0         0              0         0                 0
>>>> CompactionExecutor                8        42        1433715         0                 0
>>>> ValidationExecutor                0         0           2521         0                 0
>>>> MigrationStage                    0         0         527549         0                 0
>>>> AntiEntropyStage                  0         0           7697         0                 0
>>>> PendingRangeCalculator            0         0             17         0                 0
>>>> Sampler                           0         0              0         0                 0
>>>> MemtableFlushWriter               0         0         116966         0                 0
>>>> MemtablePostFlush                 0         0         209103         0                 0
>>>> MemtableReclaimMemory             0         0         116966         0                 0
>>>> Native-Transport-Requests         1         0     1715937778         0            176262
>>>> 
>>>> Message type           Dropped
>>>> READ                         2
>>>> RANGE_SLICE                  0
>>>> _TRACE                       0
>>>> MUTATION                  4390
>>>> COUNTER_MUTATION             0
>>>> BINARY                       0
>>>> REQUEST_RESPONSE          1882
>>>> PAGED_RANGE                  0
>>>> READ_REPAIR                  0
>>>> 
>>>> 
>>>>> On Tue, Aug 7, 2018 at 7:57 PM Jonathan Haddad <jon@jonhaddad.com> wrote:
>>>>> What's your window size?
>>>>> 
>>>>> When you say backed up, how are you measuring that?  Are there pending tasks or do you just see more files than you expect?
>>>>> 
>>>>>> On Tue, Aug 7, 2018 at 4:38 PM Brian Spindler <brian.spindler@gmail.com> wrote:
>>>>>> Hey guys, quick question: 
>>>>>>  
>>>>>> I've got a v2.1 Cassandra cluster, 12 nodes on AWS i3.2xl, commit log on one drive, data on NVMe. That was working very well; it's a time-series DB and has been accumulating data for about 4 weeks.
>>>>>> 
>>>>>> The nodes have increased in load and compaction seems to be falling behind. I used to get about 1 file per day for this column family, roughly a ~30GB Data.db file per day. I am now getting hundreds per day at 1MB-50MB.
>>>>>> 
>>>>>> How do I recover from this?
>>>>>> 
>>>>>> I can scale out to give some breathing room, but will it go back and compact the old days into nicely packed files for the day?
>>>>>> 
>>>>>> I tried setting compaction throughput to 1000 from 256 and it seemed to make things worse for the CPU; it's configured on i3.2xl with 8 compaction threads.
>>>>>> 
>>>>>> -B
>>>>>> 
>>>>>> Lastly, I have mixed TTLs in this CF and need to run a repair (I think) to get rid of old tombstones. However, running repairs in 2.1 on TWCS column families causes a very large spike in sstable counts due to anti-compaction, which causes a lot of disruption. Is there any other way?
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Jon Haddad
>>>>> http://www.rustyrazorblade.com
>>>>> twitter: rustyrazorblade
