Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of
 btv1==7625952efa8==mkjellman@barracuda.com designates 64.235.145.82 as
 permitted sender)
From: Michael Kjellman <mkjellman@barracuda.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tue, 19 Feb 2013 08:29:49 -0800
Subject: Re: Long running nodetool repair
Thread-Topic: Long running nodetool repair
Thread-Index: Ac4OvlU4VawG7KBeSUmU8EwdtbcpTQ==
Message-ID: <CD48E78B.11DC7%mkjellman@barracuda.com>
In-Reply-To: 
 <CAKimya1L4=t0sLz=-_e+C_bQb=9fr8R9vgWJuFk-+=YBm2MKcQ@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
user-agent: Microsoft-MacOutlook/14.3.0.121105
acceptlanguage: en-US
Content-Type: multipart/alternative;
	boundary="_000_CD48E78B11DC7mkjellmanbarracudacom_"
MIME-Version: 1.0
Received-SPF: softfail (barracuda.com: domain of transitioning
 mkjellman@barracuda.com does not designate ::1 as permitted sender)

--_000_CD48E78B11DC7mkjellmanbarracudacom_
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable

This is very normal (unfortunately). Are you doing a repair -pr or a
straight up repair?

Does nodetool netstats show anything? I frequently see repair hang in
1.2.1, and I haven't been able to figure out why yet. Feel free to take a
stack dump with jstack on the node doing the repair and see whether any
deadlocks are occurring after the Merkle trees are received.
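
If it helps, here is a minimal sketch for scanning a saved dump. It
assumes the dump was written to a file first (for example with
jstack <pid> > dump.txt) and that jstack marked any cycle it detected
with its usual "Found one Java-level deadlock:" banner. The function
names are made up for illustration.

```python
from pathlib import Path


def deadlock_reports(dump_text):
    """Return the text following each deadlock banner in a jstack dump."""
    # Everything before the first banner is ordinary thread output, so
    # drop element 0 of the split.
    return dump_text.split("Java-level deadlock")[1:]


def count_deadlocks_in_file(path):
    """Read a saved dump (hypothetical path) and count deadlock reports."""
    return len(deadlock_reports(Path(path).read_text()))
```

An empty result only means jstack did not flag a lock cycle; a repair
can still be stuck waiting on something that is not a Java-level
deadlock, so the thread states are worth reading as well.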

To help further, can you share the last log entries after the AntiEntropy
ones? Any streaming sessions from other nodes?
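
A rough sketch for pulling those entries out of the log, assuming the
repair lines contain the literal text "AntiEntropy". The helper names
and the log path passed in are assumptions, not anything from the
ticket.

```python
from pathlib import Path


def last_matching(lines, needle, count):
    """Return up to `count` of the final lines containing `needle`."""
    return [line for line in lines if needle in line][-max(count, 1):]


def last_repair_lines(log_path, count):
    """Tail of the AntiEntropy entries in a log file (path is assumed)."""
    return last_matching(
        Path(log_path).read_text().splitlines(), "AntiEntropy", count
    )
```

Calling last_repair_lines on the node's system log with a count of 20
or so should show where the repair session stopped reporting progress.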

Bug is being tracked here:
https://issues.apache.org/jira/browse/CASSANDRA-5146

Best,
Michael

From: Haithem Jarraya <haithem.jarraya@struq.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, February 19, 2013 1:29 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Long running nodetool repair

Hi,

I am new to Cassandra and I am not sure if this is normal behavior, but
nodetool repair runs for too long even with a small dataset per node. I
started a nodetool repair last night at 18:41; it is now 9:18 and it is
still running. The size of my data is only ~500 MB per node.
We have:
3-node cluster in DC1 with RF 3
1-node cluster in DC2 with RF 1
1-node cluster in DC3 with RF 1

and we are running Cassandra 1.2.1 with 256 vnodes.

In the Cassandra logs I no longer see AntiEntropy entries, only
CompactionTask and FlushWriter.

Is this normal behaviour for nodetool repair?
Does the running time grow linearly with the size of the data?

Any help or direction will be much appreciated.


Thanks,

H


--_000_CD48E78B11DC7mkjellmanbarracudacom_--