Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Subject: Re: Fwd: Cassandra Load spike
To: user@cassandra.apache.org
References: 
 <CAEWpZH3wPLFh-HF1MC6FrhgAkvsaRkaBg3nUdExuWm73KZO_OA@mail.gmail.com>
 <CAEWpZH2StEnQ_pC8K2EsPkfS9gXj2+r7-4T_hQyMS+46ECS+uQ@mail.gmail.com>
From: Jan Kesten <j.kesten@enercast.de>
Message-ID: <5710981C.20500@enercast.de>
Date: Fri, 15 Apr 2016 09:28:28 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101
 Thunderbird/38.7.2
MIME-Version: 1.0
In-Reply-To: 
 <CAEWpZH2StEnQ_pC8K2EsPkfS9gXj2+r7-4T_hQyMS+46ECS+uQ@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

Hi,

You should check the "snapshots" directories on your nodes. It is very
likely that old snapshots left over from failed operations are still
taking up space.
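
To see how much space each snapshot holds, something like the following
rough sketch might help. It assumes the usual on-disk layout of
<data_dir>/<keyspace>/<table>/snapshots/<snapshot_name>/ and the default
package data directory /var/lib/cassandra/data; both are assumptions, so
adjust DATA_DIR to match data_file_directories in your cassandra.yaml.
The function names are made up for illustration. Snapshot files are hard
links, so the sizes can overlap with live SSTables until those are
compacted away.

```python
import os

# Assumed default location; change to match your cassandra.yaml.
DATA_DIR = "/var/lib/cassandra/data"


def dir_size(path):
    """Return the total size in bytes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # File vanished (e.g. cleared concurrently); skip it.
                pass
    return total


def snapshot_usage(data_dir):
    """Map (keyspace, table_dir, snapshot_name) to its size in bytes."""
    usage = {}
    if not os.path.isdir(data_dir):
        return usage
    for keyspace in sorted(os.listdir(data_dir)):
        ks_path = os.path.join(data_dir, keyspace)
        if not os.path.isdir(ks_path):
            continue
        for table in sorted(os.listdir(ks_path)):
            snap_root = os.path.join(ks_path, table, "snapshots")
            if not os.path.isdir(snap_root):
                continue
            for snap in sorted(os.listdir(snap_root)):
                snap_path = os.path.join(snap_root, snap)
                if os.path.isdir(snap_path):
                    usage[(keyspace, table, snap)] = dir_size(snap_path)
    return usage


if __name__ == "__main__":
    # Print the largest snapshots first.
    by_size = sorted(snapshot_usage(DATA_DIR).items(), key=lambda kv: -kv[1])
    for (keyspace, table, snap), size in by_size:
        print(f"{size / 1024 ** 2:10.1f} MiB  {keyspace}/{table}/{snap}")
```

"nodetool listsnapshots" should give a similar overview, and "nodetool
clearsnapshot" removes snapshots once you are sure they are no longer
needed.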

On 15.04.2016 at 01:28, kavya wrote:
> Hi,
>
> We are running a 6-node Cassandra 2.2.4 cluster and we are seeing a
> spike in the disk Load reported by the 'nodetool status' command that
> does not correspond to the actual disk usage. The Load reported by
> nodetool was as high as 3 times the actual disk usage on certain nodes.
> We also noticed that the periodic repair, run with the command
> 'nodetool repair -pr', failed with the error below:
>
> ERROR [RepairJobTask:2] 2016-04-12 15:46:29,902 RepairRunnable.java:243 - Repair session 64b54d50-0100-11e6-b46e-a511fd37b526 for range (-3814318684016904396,-3810689996127667017] failed with error [….] Validation failed in /<ip>
> org.apache.cassandra.exceptions.RepairException: [….] Validation failed in <ip>
>     at org.apache.cassandra.repair.ValidationTask.treeReceived(ValidationTask.java:64) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:183) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:410) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:163) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_40]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
>
> We restarted all nodes in the cluster and ran a full repair, which
> completed successfully without any validation errors. However, we still
> see the Load spike on the same nodes after a while. Please advise.
>
> Thanks!
>