cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benjamin Roth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12730) Thousands of empty SSTables created during repair - TMOF death
Date Fri, 30 Sep 2016 08:34:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535417#comment-15535417
] 

Benjamin Roth commented on CASSANDRA-12730:
-------------------------------------------

BTW: It took a whole work-day (under normal operation conditions) to compact all that tiny
tiny SSTables away after the node crashed. That just show how much impact that puts on compaction.

> Thousands of empty SSTables created during repair - TMOF death
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-12730
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12730
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Benjamin Roth
>            Priority: Critical
>
> Last night I ran a repair on a keyspace with 7 tables and 4 MVs each containing a few
hundret million records. After a few hours a node died because of "too many open files".
> Normally one would just raise the limit, but: We already set this to 100k. The problem
was that the repair created roughly over 100k SSTables for a certain MV. The strange thing
is that these SSTables had almost no data (like 53bytes, 90bytes, ...). Some of them (<5%)
had a few 100 KB, very few (<1% had normal sizes like >= few MB). I could understand,
that SSTables queue up as they are flushed and not compacted in time but then they should
have at least a few MB (depending on config and avail mem), right?
> Of course then the node runs out of FDs and I guess it is not a good idea to raise the
limit even higher as I expect that this would just create even more empty SSTables before
dying at last.
> Only 1 CF (MV) was affected. All other CFs (also MVs) behave sanely. Empty SSTables have
been created equally over time. 100-150 every minute. Among the empty SSTables there are also
Tables that look normal like having few MBs.
> I didn't see any errors or exceptions in the logs until TMOF occured. Just tons of streams
due to the repair (which I actually run over cs-reaper as subrange, full repairs).
> After having restarted that node (and no more repair running), the number of SSTables
went down again as they are compacted away slowly.
> According to [~zznate] this issue may relate to CASSANDRA-10342 + CASSANDRA-8641



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message