Subject: Re: nodetool repair taking forever
From: aaron morton <aaron@thelastpickle.com>
To: user@cassandra.apache.org
Date: Tue, 22 May 2012 21:05:18 +1200

> I also don't understand: if all these nodes are replicas of each other, why is it that the first node has almost double the data?

Have you performed any token moves? Old data is not deleted unless you run nodetool cleanup.

Another possibility is things like a lot of hints. Admittedly it would have to be a *lot* of hints.

The third is that compaction has fallen behind.

> This week it's even worse: the nodetool repair has been running for the last 15 hours just on the first node, and when I run nodetool compactionstats I constantly see this -
>
> pending tasks: 3

First, check the logs for errors.

Repair will first calculate the differences; you can see this as a validation compaction in nodetool compactionstats. Then it will stream the data; you can watch that with nodetool netstats.

Try to work out which part is taking the most time. 15 hours for 50 GB sounds like a long time (btw, do you have compaction on?)
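The two-phase breakdown above (validation, then streaming) can be checked from the command line. A minimal sketch — the `-h` host, the log path, and the scripted `awk` line are assumptions, and the parse is shown against a captured sample rather than a live node:

```shell
# Phase 1: validation (Merkle tree) compactions show up here:
#   nodetool -h localhost compactionstats
# Phase 2: streaming between replicas shows up here:
#   nodetool -h localhost netstats
# Errors during repair usually land in the system log (path is an assumption):
#   grep -iE 'error|exception|AntiEntropy' /var/log/cassandra/system.log

# For scripting, the pending-task count can be pulled out of the
# compactionstats output; shown here against a captured sample line:
sample='pending tasks: 3'
printf '%s\n' "$sample" | awk -F': ' '/^pending tasks/ {print $2}'
# -> 3
```

If the count stays flat for hours with no validation compaction listed and nothing moving in netstats, that points at the repair being stuck rather than merely slow.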
Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/05/2012, at 3:14 AM, Raj N wrote:

> Hi experts,
>
> I have a 6 node cluster spread across 2 DCs.
>
> DC   Rack   Status  State   Load      Owns    Token
>                                               113427455640312814857969558651062452225
> DC1  RAC13  Up      Normal  95.98 GB  33.33%  0
> DC2  RAC5   Up      Normal  50.79 GB  0.00%   1
> DC1  RAC18  Up      Normal  50.83 GB  33.33%  56713727820156407428984779325531226112
> DC2  RAC7   Up      Normal  50.74 GB  0.00%   56713727820156407428984779325531226113
> DC1  RAC19  Up      Normal  61.72 GB  33.33%  113427455640312814857969558651062452224
> DC2  RAC9   Up      Normal  50.83 GB  0.00%   113427455640312814857969558651062452225
>
> They are all replicas of each other. All reads and writes are done at LOCAL_QUORUM. We are on Cassandra 0.8.4. I see that our weekend nodetool repair runs for more than 12 hours, especially on the first node, which has 96 GB of data. Is this usual? We are using 500 GB SAS drives with an ext4 file system. This gets worse every week. This week it's even worse: the nodetool repair has been running for the last 15 hours just on the first node, and when I run nodetool compactionstats I constantly see this -
>
> pending tasks: 3
>
> and nothing else. Looks like it's just stuck. There's nothing substantial in the logs either. I also don't understand: if all these nodes are replicas of each other, why is it that the first node has almost double the data? Any help will be really appreciated.
>
> Thanks
> -Raj