From: Oleksandr Shulgin
Date: Fri, 16 Aug 2019 16:13:17 +0200
Subject: Odd number of files on one node during repair (was: To Repair or Not to Repair)
To: User
Reply-To: user@cassandra.apache.org
References: <53b1d3a7753446099f24cafcd9de9d9f@metricly.com>

On Tue, Aug 13, 2019 at 6:14 PM Oleksandr Shulgin <oleksandr.shulgin@zalando.de> wrote:

> I was wondering about this again, as I've noticed one of the nodes in our
> cluster accumulating ten times the number of files compared to the average
> across the rest of the cluster. The files are all coming from a table with
> TWCS, and repair (running with Reaper) is ongoing. The sudden growth
> started around 24 hours ago, as the affected node was restarted due to a
> failing AWS EC2 System check.

And now that the next weekly repair has started, the same node shows the problem again. The number of files went up to 6,000 in the last 7 hours, compared to the average of ~1,500 on the rest of the nodes, which remains more or less constant.

Any advice on how to debug it?

Regards,
--
Alex