cassandra-commits mailing list archives

From "Anuj Wadehra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9146) Ever Growing sstables after every Repair
Date Sat, 06 Jun 2015 19:51:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575904#comment-14575904 ]

Anuj Wadehra commented on CASSANDRA-9146:
-----------------------------------------

I am observing that even when vnodes are not damaged and every log line reads "Endpoints
/x.x.x.x and /x.x.x.y are consistent for <CF>", hundreds of tiny sstables are still
created during repair. This is happening for a wide-row CF. Why is repair writing sstables
when nothing was out of sync?

Yes, we will upgrade to 2.0.14 soon, but upgrading production systems takes time. Meanwhile,
the thousands of tiny sstables, combined with the sstable coldness issue
(https://issues.apache.org/jira/browse/CASSANDRA-8885) in 2.0.3, are causing problems.
Is there any workaround until we upgrade?
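
A minimal sketch for measuring how many tiny sstables each repair leaves behind, so the
counts before and after a repair can be compared. The function name, the 1 MB threshold
and the example path are assumptions for illustration; the only convention relied on is
that sstable data files end in "-Data.db" and that snapshot and backup copies live in
"snapshots" and "backups" subdirectories.

```python
import os


def count_small_sstables(data_dir, max_bytes=1024 * 1024):
    """Return (total, small) counts of sstable data files under data_dir.

    A file counts as "small" when its size is <= max_bytes (threshold is
    an arbitrary assumption, not a Cassandra setting).
    """
    total = 0
    small = 0
    for root, dirs, files in os.walk(data_dir):
        # Skip snapshot/backup directories so hard-linked copies are not double counted.
        dirs[:] = [d for d in dirs if d not in ("snapshots", "backups")]
        for name in files:
            if not name.endswith("-Data.db"):
                continue  # ignore Index, Filter, Statistics and other components
            total += 1
            if os.path.getsize(os.path.join(root, name)) <= max_bytes:
                small += 1
    return total, small


if __name__ == "__main__":
    # Hypothetical path; point it at the column family's data directory.
    print(count_small_sstables("/var/lib/cassandra/data/my_keyspace/my_cf"))
```

Running it before and after "repair -pr" gives a rough number for how many tiny sstables
a single repair added.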

> Ever Growing sstables after every Repair
> ----------------------------------------
>
>                 Key: CASSANDRA-9146
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9146
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Anuj Wadehra
>         Attachments: sstables.txt, system-modified.log
>
>
> The cluster has reached a state where every "repair -pr" operation on the CF results in
> numerous tiny sstables being flushed to disk. Because of the thousands of sstables, reads
> have started timing out. Even though compaction begins for one of the secondary indexes,
> the sstable count after repair remains very high (thousands). Every repair adds thousands
> of sstables.
> Problems:
> 1. Why are bursts of tiny sstables flushed during repair? What is triggering the
> frequent/premature flush of sstables (more than a hundred in every burst)? At most we see
> one ParNew GC pause >200ms.
> 2. Why is auto-compaction not compacting all sstables? Is it related to the coldness issue
> (CASSANDRA-8885), where compaction doesn't work even when cold_reads_to_omit=0 by default?
>    If coldness is the issue, we are stuck in an infinite loop: reads will trigger compaction,
> but reads time out because the sstable count is in the thousands.
> 3. What's the way out if we face this issue in production?
> Is this issue fixed in the latest production release, 2.0.13? The issue looks similar to
> CASSANDRA-8641, but that issue is fixed only in 2.1.3. I think it should be fixed in the
> 2.0 branch too.
> Configuration:
> Compaction Strategy: STCS
> memtable_flush_writers=4
> memtable_flush_queue_size=4
> in_memory_compaction_limit_in_mb=32
> concurrent_compactors=12



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
