Subject: Re: Fwd: Cassandra Load spike
To: user@cassandra.apache.org
From: Jan Kesten
Message-ID: <5710981C.20500@enercast.de>
Date: Fri, 15 Apr 2016 09:28:28 +0200

Hi,

you should check the "snapshots" directories on your nodes. It is very likely that there are some old ones left over from failed operations taking up space.

On 15.04.2016 at 01:28, kavya wrote:
> Hi,
>
> We are running a 6-node Cassandra 2.2.4 cluster and we are seeing a
> spike in the disk Load as per the ‘nodetool status’ command that does
> not correspond with the actual disk usage. Load reported by nodetool
> was as high as 3 times the actual disk usage on certain nodes.
> We noticed that the periodic repair failed with the error below when
> running the command ‘nodetool repair -pr’:
>
> ERROR [RepairJobTask:2] 2016-04-12 15:46:29,902 RepairRunnable.java:243 - Repair session 64b54d50-0100-11e6-b46e-a511fd37b526 for range (-3814318684016904396,-3810689996127667017] failed with error [….] Validation failed in /
> org.apache.cassandra.exceptions.RepairException: [….]
> Validation failed in
>     at org.apache.cassandra.repair.ValidationTask.treeReceived(ValidationTask.java:64) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:183) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:410) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:163) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_40]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
>
> We restarted all nodes in the cluster and ran a full repair, which
> completed successfully without any validation errors. However, we still
> see the Load spike on the same nodes after a while. Please advise.
>
> Thanks!
>
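
For anyone who wants to follow the snapshot suggestion above, here is a minimal sketch of one way to see how much space the snapshot directories hold on a node. It is not part of the original reply. It assumes the usual on-disk layout (data_dir/keyspace/table/snapshots/...) and a data directory of /var/lib/cassandra/data, which is only a common default; the function name snapshot_sizes is made up for illustration.

```python
import os
from pathlib import Path


def snapshot_sizes(data_dir):
    """Map each 'snapshots' directory under data_dir to the total size
    in bytes of the files it contains (hypothetical helper)."""
    sizes = {}
    for root, dirs, _files in os.walk(data_dir):
        if "snapshots" in dirs:
            snap_dir = Path(root) / "snapshots"
            total = 0
            for sub_root, _sub_dirs, sub_files in os.walk(snap_dir):
                for name in sub_files:
                    total += os.lstat(os.path.join(sub_root, name)).st_size
            sizes[str(snap_dir)] = total
            # Already summed above, so do not descend into it again.
            dirs.remove("snapshots")
    return sizes


if __name__ == "__main__":
    # Assumption: adjust this to the data_file_directories setting
    # in your cassandra.yaml.
    data_dir = "/var/lib/cassandra/data"
    by_size = sorted(snapshot_sizes(data_dir).items(), key=lambda kv: -kv[1])
    for path, size in by_size:
        print(f"{size / 1024 ** 2:10.1f} MiB  {path}")
```

Snapshot files are hard links to SSTables, so the totals can overlap with live data and should be read as an upper bound on what removing them would free. If old snapshots do turn out to be the cause, nodetool clearsnapshot is the usual way to remove them.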