Subject: Re: How to get rid of "Cannot start multiple repair sessions over the same sstables" exception
From: Alexander Dejanovski <alex@thelastpickle.com>
Date: Wed, 28 Sep 2016 15:46:07 +0000
To: user@cassandra.apache.org

Robert,

You can restart them in any order; that doesn't make a difference, AFAIK.

Cheers

On Wed, Sep 28, 2016 at 5:10 PM, Robert Sicoie <robert.sicoie@gmail.com> wrote:

> Thanks Alexander,
>
> Yes, with tpstats I can see the hanging active repair(s) (output
> attached). On one node there are 31 pending repairs; on the others there
> are fewer (minimum 12). Is there any recommendation for the restart
> order? The one with the fewest pending repairs first, perhaps?
>
> Thanks,
> Robert
>
> On Wed, Sep 28, 2016 at 5:35 PM, Alexander Dejanovski
> <alex@thelastpickle.com> wrote:
>
>> They will show up in nodetool compactionstats:
>> https://issues.apache.org/jira/browse/CASSANDRA-9098
>>
>> Did you check nodetool tpstats to see whether any repair sessions are
>> still running? Just to make sure (and if you can actually do it), do a
>> rolling restart of the cluster and try again. Repair sessions can get
>> sticky sometimes.
>>
>> On Wed, Sep 28, 2016 at 4:23 PM, Robert Sicoie
>> <robert.sicoie@gmail.com> wrote:
>>
>>> I am using nodetool compactionstats to check for pending compactions,
>>> and it shows 0 pending on all nodes seconds before I run nodetool
>>> repair. I am also monitoring PendingCompactions over JMX.
>>>
>>> Is there any other way I can find out whether an anticompaction is
>>> running on any node?
>>>
>>> Thanks a lot,
>>> Robert
>>>
>>> On Wed, Sep 28, 2016 at 4:44 PM, Alexander Dejanovski
>>> <alex@thelastpickle.com> wrote:
>>>
>>>> Robert,
>>>>
>>>> You need to make sure you have no repair session currently running
>>>> on your cluster, and no anticompaction.
>>>> I'd recommend doing a rolling restart to be sure all running repairs
>>>> are stopped, then starting the process again, node by node, checking
>>>> that no anticompaction is running before moving from one node to the
>>>> next.
>>>>
>>>> Please do not use the -pr switch: it is both useless (token ranges
>>>> are repaired only once with incremental repair, whatever the
>>>> replication factor) and harmful, because not all anticompactions
>>>> will be executed (you'll still have sstables marked as unrepaired
>>>> even if the process has run entirely without error).
>>>>
>>>> Let us know how that goes.
>>>>
>>>> Cheers,
>>>>
>>>> On Wed, Sep 28, 2016 at 2:57 PM, Robert Sicoie
>>>> <robert.sicoie@gmail.com> wrote:
>>>>
>>>>> Thanks Alexander,
>>>>>
>>>>> Now I have started running the repair with the -pr arg and with
>>>>> keyspace and table args. Still, I got
>>>>>
>>>>> ERROR [RepairJobTask:1] 2016-09-28 11:34:38,288
>>>>> RepairRunnable.java:246 - Repair session
>>>>> 89af4d10-856f-11e6-b28f-df99132d7979 for range
>>>>> [(8323429577695061526,8326640819362122791], ...,
>>>>> (4212695343340915405,4229348077081465596]]] Validation failed in
>>>>> /10.45.113.88
>>>>>
>>>>> for one of the tables. 10.45.113.88 is the IP of the machine I am
>>>>> running nodetool on. I'm wondering if this is normal...
>>>>>
>>>>> Thanks,
>>>>> Robert
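A minimal sketch of the tpstats/compactionstats checks suggested above, run against every node before launching the next repair session. The node list is a placeholder, and passwordless SSH plus nodetool on each node's PATH are assumptions:

    #!/usr/bin/env bash
    # Look for running repair sessions and anticompactions on every node.
    # Node addresses below are hypothetical; substitute your own.
    NODES="node1 node2 node3 node4 node5"

    for node in $NODES; do
      echo "== $node =="
      # Repair activity shows up in tpstats pools such as
      # AntiEntropyStage and ValidationExecutor.
      ssh "$node" nodetool tpstats | grep -Ei 'antientropy|validation|repair'
      # Since CASSANDRA-9098, anticompactions are listed by compactionstats.
      ssh "$node" nodetool compactionstats
    done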
>>>>> On Wed, Sep 28, 2016 at 11:53 AM, Alexander Dejanovski
>>>>> <alex@thelastpickle.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> nodetool scrub won't help here: what you're experiencing is most
>>>>>> likely that one SSTable is going through anticompaction, and then
>>>>>> another node asks for a Merkle tree that involves it. For
>>>>>> understandable reasons, an SSTable cannot be anticompacted and
>>>>>> validation compacted at the same time.
>>>>>>
>>>>>> The solution here is to adjust the repair pressure on your cluster
>>>>>> so that anticompaction can end before you run repair on another
>>>>>> node. You may have a lot of anticompaction to do if you had high
>>>>>> volumes of unrepaired data, and it can take a long time depending
>>>>>> on several factors.
>>>>>>
>>>>>> You can tune your repair process to make sure no anticompaction is
>>>>>> running before launching a new session on another node, or you can
>>>>>> try my Reaper fork that handles incremental repair:
>>>>>> https://github.com/adejanovski/cassandra-reaper/tree/inc-repair-support-with-ui
>>>>>> I may have to add a few checks to avoid all collisions between
>>>>>> anticompactions and new sessions, but it should be helpful if you
>>>>>> struggle with incremental repair.
>>>>>>
>>>>>> In any case, check whether your nodes are still anticompacting
>>>>>> before trying to run a new repair session on a node.
>>>>>>
>>>>>> Cheers,
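One way to wire up that "check before launching a new session" gate, sketched under the same assumptions as above (SSH access, placeholder hostname): block until compactionstats stops reporting anticompactions, then start the node's repair.

    #!/usr/bin/env bash
    # Gate a node's repair on its anticompactions having finished.
    NODE="node1"   # hypothetical; in practice check every node

    # Poll until no anticompaction is reported.
    while ssh "$NODE" nodetool compactionstats | grep -qi anticompaction; do
      echo "$NODE is still anticompacting, waiting..."
      sleep 60
    done

    # Incremental repair (the default in 3.0), deliberately without -pr,
    # per the advice earlier in this thread.
    ssh "$NODE" nodetool repair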
>>>>>> On Wed, Sep 28, 2016 at 10:31 AM, Robert Sicoie
>>>>>> <robert.sicoie@gmail.com> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> I have a cluster of 5 nodes, Cassandra 3.0.5. I was running
>>>>>>> nodetool repair over the last few days, one node at a time, when
>>>>>>> I first encountered this exception:
>>>>>>>
>>>>>>> ERROR [ValidationExecutor:11] 2016-09-27 16:12:20,409 CassandraDaemon.java:195 - Exception in thread Thread[ValidationExecutor:11,1,main]
>>>>>>> java.lang.RuntimeException: Cannot start multiple repair sessions over the same sstables
>>>>>>>     at org.apache.cassandra.db.compaction.CompactionManager.getSSTablesToValidate(CompactionManager.java:1194) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1084) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.db.compaction.CompactionManager.access$700(CompactionManager.java:80) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.db.compaction.CompactionManager$10.call(CompactionManager.java:714) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_60]
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_60]
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
>>>>>>>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
>>>>>>>
>>>>>>> On some of the other boxes I see this:
>>>>>>>
>>>>>>> Caused by: org.apache.cassandra.exceptions.RepairException: [repair #9dd21ab0-83f4-11e6-b28f-df99132d7979 on notes/operator_source_mv, [(-7505573573695693981,-7495786486761919991], ..., (-8483612809930827919,-8480482504800860871]]] Validation failed in /10.45.113.67
>>>>>>>     at org.apache.cassandra.repair.ValidationTask.treesReceived(ValidationTask.java:68) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:183) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:408) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:168) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_60]
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_60]
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
>>>>>>>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
>>>>>>> ERROR [RepairJobTask:3] 2016-09-26 16:39:33,096 CassandraDaemon.java:195 - Exception in thread Thread[RepairJobTask:3,5,RMI Runtime]
>>>>>>> java.lang.AssertionError: java.lang.InterruptedException
>>>>>>>     at org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:172) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:761) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:729) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at org.apache.cassandra.repair.ValidationTask.run(ValidationTask.java:56) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_60]
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_60]
>>>>>>>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_60]
>>>>>>> Caused by: java.lang.InterruptedException: null
>>>>>>>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220) ~[na:1.8.0_60]
>>>>>>>     at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335) ~[na:1.8.0_60]
>>>>>>>     at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339) ~[na:1.8.0_60]
>>>>>>>     at org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:168) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>>>>>>     ... 6 common frames omitted
>>>>>>>
>>>>>>> Now if I run nodetool repair I get the
>>>>>>>
>>>>>>> java.lang.RuntimeException: Cannot start multiple repair sessions over the same sstables
>>>>>>>
>>>>>>> exception. What do you suggest? Would nodetool scrub or
>>>>>>> sstablescrub help in this case, or would it just make things
>>>>>>> worse?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Robert
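To verify afterwards whether incremental repair actually marked sstables as repaired (relevant to the unrepaired-sstables caveat about -pr above), the sstablemetadata tool can be checked per sstable. A sketch; the data path, keyspace, and table names are hypothetical:

    #!/usr/bin/env bash
    # "Repaired at: 0" means the sstable is still marked unrepaired.
    for f in /var/lib/cassandra/data/my_keyspace/my_table-*/ma-*-big-Data.db; do
      echo "$f"
      sstablemetadata "$f" | grep 'Repaired at'
    done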
--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
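For reference, the rolling restart recommended several times in this thread could be scripted roughly as follows. The service name and node list are assumptions, and the status check assumes the loop runs from a node that stays up and that the listed names match the addresses printed by nodetool status:

    #!/usr/bin/env bash
    # Restart nodes one at a time, waiting for each to come back Up/Normal.
    for node in node1 node2 node3 node4 node5; do
      ssh "$node" nodetool drain                    # flush memtables, stop accepting traffic
      ssh "$node" sudo systemctl restart cassandra  # service name is an assumption
      # Wait until nodetool status reports the node as UN before moving on.
      until nodetool status | grep -w "$node" | grep -q '^UN'; do
        sleep 10
      done
    done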