Date: Mon, 7 Nov 2016 22:54:58 +0000 (UTC)
From: "Kurt Greaves (JIRA)"
To: commits@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-12730) Thousands of empty SSTables created during repair - TMOF death

    [ https://issues.apache.org/jira/browse/CASSANDRA-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645739#comment-15645739 ]

Kurt Greaves commented on CASSANDRA-12730:
------------------------------------------

For the record, this looks to me like a potentially harmless optimisation, but it doesn't really fix the core issue. I've seen similar problems on a lot of clusters running a lot of different versions (pretty much every version since 2.1.11). The issue occurs quite frequently, and not necessarily on clusters under heap pressure. Usually when we see it, the logs report masses of streaming sessions and flushes.

The proposed solution seems simple and harmless enough, even though it may not fix the real issue, and I think it would help in cases where a table receives a small but constant stream of writes, such that the memtable is never or rarely clean when a flush is triggered, resulting in lots of small SSTables.

I think the real issue is a problem elsewhere that triggers lots of stream sessions during the repair even though the data is consistent. We've seen this sort of thing in cases where we're fairly sure there have been no down nodes and no dropped mutations, yet even running repairs continuously results in many streams.
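To make the optimisation being discussed concrete, here is a minimal sketch of the kind of guard proposed: skip the flush entirely when the memtable holds no data, rather than writing out an empty SSTable. The names below (FlushGuard, Memtable, getLiveDataSize, maybeFlush) are illustrative assumptions, not Cassandra's actual internals.

    // A minimal sketch of the proposed guard, under the assumption that a
    // repair-triggered flush can simply be skipped when the memtable is clean.
    final class FlushGuard
    {
        interface Memtable
        {
            long getLiveDataSize(); // bytes of live data currently held
            void flush();           // write the contents out as a new SSTable
        }

        // Returns true only if a flush actually happened.
        static boolean maybeFlush(Memtable memtable)
        {
            if (memtable.getLiveDataSize() == 0)
            {
                // Nothing to persist: flushing here would only create an empty
                // SSTable and burn file descriptors for no benefit.
                return false;
            }
            memtable.flush();
            return true;
        }
    }

With such a guard, a repair-triggered flush of a clean table becomes a no-op instead of producing yet another SSTable of a few dozen bytes.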
> Thousands of empty SSTables created during repair - TMOF death
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-12730
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12730
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Benjamin Roth
>            Priority: Critical
>
> Last night I ran a repair on a keyspace with 7 tables and 4 MVs, each containing a few hundred million records. After a few hours a node died because of "too many open files".
> Normally one would just raise the limit, but we had already set it to 100k. The problem was that the repair created over 100k SSTables for a certain MV. The strange thing is that these SSTables held almost no data (53 bytes, 90 bytes, ...). Some of them (<5%) had a few hundred KB, and very few (<1%) had normal sizes of a few MB or more. I could understand SSTables queuing up when they are flushed faster than they can be compacted, but then each should hold at least a few MB (depending on config and available memory), right?
> Of course the node then runs out of FDs, and I guess it is not a good idea to raise the limit even higher, as I expect that would just create even more empty SSTables before the node dies anyway.
> Only one CF (an MV) was affected; all other CFs (including MVs) behaved sanely. The empty SSTables were created evenly over time, 100-150 every minute. Among the flood of new SSTables there were also some that looked normal, with sizes of a few MB.
> I didn't see any errors or exceptions in the logs until TMOF occurred, just tons of streams due to the repair (which I actually run via cs-reaper as subrange, full repairs).
> After restarting that node (with no repair running), the number of SSTables went down again as they were slowly compacted away.
> According to [~zznate] this issue may relate to CASSANDRA-10342 + CASSANDRA-8641
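For what it's worth, the pathology described above is easy to quantify on disk, since every SSTable is backed by several component files and therefore several file descriptors. A rough diagnostic sketch follows, assuming the usual *-Data.db component naming; the data directory path and the 1 KB "empty" threshold are likewise assumptions for illustration only.

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    // Count how many SSTable data files in one table's directory are
    // suspiciously small, as a quick check for the pathology reported above.
    public final class TinySSTableCounter
    {
        public static void main(String[] args) throws IOException
        {
            // Hypothetical data directory; pass the real one as the first argument.
            Path dataDir = Paths.get(args.length > 0 ? args[0]
                                                     : "/var/lib/cassandra/data/myks/mymv");
            long tiny = 0, total = 0;
            try (DirectoryStream<Path> files = Files.newDirectoryStream(dataDir, "*-Data.db"))
            {
                for (Path f : files)
                {
                    total++;
                    if (Files.size(f) < 1024) // "empty" SSTables were reported at ~53-90 bytes
                        tiny++;
                }
            }
            System.out.printf("%d of %d SSTables are under 1 KB%n", tiny, total);
        }
    }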