cassandra-commits mailing list archives

From "Anuj (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
Date Thu, 09 Apr 2015 18:24:15 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anuj updated CASSANDRA-9146:
----------------------------
    Attachment: sstables.txt
                system-modified.log

Please find the logs attached:
1. system-modified.log = system logs
2. sstables.txt = listing of sstables for the ks1cf1 column family in the test_ks1 keyspace

"repair -pr" was run on the node on 3 occasions, each time creating numerous sstables every second:
2015-04-09 09:14:36 to 2015-04-09 12:07:28
2015-04-09 14:34 (stopped at 15:07)
2015-04-09 15:11
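
The flush bursts can be quantified from the attached log. The following is a minimal sketch, assuming Cassandra 2.0-style "Enqueuing flush of Memtable-<cf>@..." log lines (the exact format is an assumption); it buckets flush events per column family per minute:

    import re
    import sys
    from collections import Counter

    # Matches e.g. "INFO ... 2015-04-09 09:14:36,123 ... Enqueuing flush of
    # Memtable-ks1cf1.ks1cf1Idx1@123456(...)" and captures minute + CF name.
    FLUSH_RE = re.compile(
        r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}.*Enqueuing flush of Memtable-([\w.]+)@')

    def main(path):
        per_minute = Counter()  # (minute, column family) -> flush count
        for line in open(path):
            m = FLUSH_RE.search(line)
            if m:
                per_minute[(m.group(1), m.group(2))] += 1
        for (minute, cf), n in sorted(per_minute.items()):
            print('%s  %-30s %d flushes' % (minute, cf, n))

    if __name__ == '__main__':
        main(sys.argv[1])  # e.g. python flush_bursts.py system-modified.log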

While only 42 sstables exist for ks1cf1Idx3, as it was compacting regularly, the other two
indexes, ks1cf1Idx1 and ks1cf1Idx2, have 8932 sstables.
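
These counts can be reproduced from sstables.txt with a short script. A minimal sketch, assuming 2.0-era "jb" file names such as test_ks1-ks1cf1.ks1cf1Idx1-jb-42-Data.db (adjust the pattern if the listing differs):

    import re
    import sys
    from collections import Counter

    # Matches "<ks>-<cf>[.<index>]-jb-<generation>-Data.db".
    DATA_RE = re.compile(r'^(\S+?)-jb-\d+-Data\.db$')

    def count_sstables(listing_path):
        counts = Counter()
        for line in open(listing_path):
            name = line.strip().split('/')[-1]  # tolerate full paths
            m = DATA_RE.match(name)
            if m:
                counts[m.group(1)] += 1         # key: <ks>-<cf>[.<index>]
        return counts

    if __name__ == '__main__':
        for cf, n in sorted(count_sstables(sys.argv[1]).items()):
            print('%-45s %d' % (cf, n))         # e.g. python count.py sstables.txt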

> Ever Growing Secondary Index sstables after every Repair
> --------------------------------------------------------
>
>                 Key: CASSANDRA-9146
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9146
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Anuj
>         Attachments: sstables.txt, system-modified.log
>
>
> Cluster has reached a state where every "repair -pr" operation on the CF results in
> numerous tiny sstables being flushed to disk. Most sstables belong to secondary indexes.
> Due to the thousands of sstables, reads have started timing out. Even though compaction
> begins for one of the secondary indexes, the sstable count after repair remains very
> high (thousands). Every repair adds thousands of sstables.
> Problems:
> 1. Why are bursts of tiny secondary-index sstables flushed during repair? What is
> triggering the frequent/premature flushing of secondary-index sstables (more than a
> hundred in every burst)? At most we see one ParNew GC pause >200ms.
> 2. Why is auto-compaction not compacting all the sstables? Is it related to the coldness
> issue (CASSANDRA-8885), where compaction doesn't work even though cold_reads_to_omit=0
> by default? (A hedged experiment sketch follows the configuration list below.)
>    If coldness is the issue, we are stuck in an infinite loop: reads would trigger
> compaction, but reads time out as the sstable count is in the thousands.
> 3. What's the way out if we face this issue in Prod?
> Is this issue fixed in the latest production release, 2.0.13? The issue looks similar to
> CASSANDRA-8641, but that issue is fixed only in 2.1.3. I think it should be fixed in the
> 2.0 branch too.
> Configuration:
> Compaction Strategy: STCS
> memtable_flush_writers=4
> memtable_flush_queue_size=4
> in_memory_compaction_limit_in_mb=32
> concurrent_compactors=12
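
Regarding problem 2 above: if the coldness logic is the suspect, one experiment is to pin cold_reads_to_omit explicitly on the table instead of relying on the default. The sketch below uses the DataStax Python driver; the contact point is a placeholder, and this is an experiment under that assumption, not a verified fix for CASSANDRA-8885:

    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])   # assumed contact point
    session = cluster.connect('test_ks1')

    # Note: ALTER ... WITH compaction replaces the whole option map, so any
    # non-default STCS settings must be restated alongside cold_reads_to_omit.
    session.execute("""
        ALTER TABLE ks1cf1
        WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                           'cold_reads_to_omit': 0.0}
    """)
    cluster.shutdown()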



