Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: local policy)
From: aaron morton <aaron@thelastpickle.com>
Content-Type: multipart/alternative;
 boundary="Apple-Mail=_9C21151F-34C2-4D39-9C46-C5ED522C7444"
Message-Id: <E53555C4-69CC-4E96-B2C8-A847F1132198@thelastpickle.com>
Mime-Version: 1.0 (Mac OS X Mail 6.0 \(1485\))
Subject: Re: nodetool repair - when is it not needed ?
Date: Fri, 24 Aug 2012 10:15:03 +1200
References: 
 <CAPVoXPz1sLYht2ZZXR1q5cSLSj40pShPm7PkBSDeL+zUtqCJWw@mail.gmail.com>
 <CACHzRHbtHaFTb_9AX3q9mOq8OFm-FkahA-3FmTnW=1Mb8oQC_Q@mail.gmail.com>
 <B2BAD0C8-D435-4003-A0AA-4AF2709443E7@thelastpickle.com>
To: user@cassandra.apache.org
In-Reply-To: <B2BAD0C8-D435-4003-A0AA-4AF2709443E7@thelastpickle.com>


--Apple-Mail=_9C21151F-34C2-4D39-9C46-C5ED522C7444
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=iso-8859-1

> Also when hints are replayed they are sent off as mutations, which may
> still be dropped by the target if they are not serviced before
> rpc_timeout. Sending nodes throttle their requests so it's unlikely but
> possible.

My bad there. I thought the mutations were sent one way.

When a node is sending hints it waits the normal rpc_timeout for each one.
If there is a timeout, hint delivery for that endpoint is aborted. It will
be retried in the next HH round, which is every 10 minutes.
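
To make that flow concrete, here is a rough sketch of the behaviour described
above. It is not Cassandra's code: deliver_hints, HintTimeout,
make_flaky_send and the RPC_TIMEOUT_MS value are all names and numbers I
made up for illustration. It only models "wait for each hint, abort the
endpoint on the first timeout, pick up the rest next round".

```python
from collections import deque

RPC_TIMEOUT_MS = 10_000  # assumed value, for illustration only


class HintTimeout(Exception):
    """Hypothetical error: the target did not ack a hint within rpc_timeout."""


def deliver_hints(pending, send):
    """Replay queued hints for one endpoint, stopping at the first timeout.

    pending: deque of hints for a single endpoint, oldest first.
    send:    callable(hint, timeout_ms) that raises HintTimeout on no ack.
    Returns the number of hints delivered in this round.
    """
    delivered = 0
    while pending:
        hint = pending[0]
        try:
            send(hint, RPC_TIMEOUT_MS)
        except HintTimeout:
            # Abort delivery for this endpoint; the undelivered hints stay
            # queued and are picked up again in the next HH round.
            break
        pending.popleft()
        delivered += 1
    return delivered


def make_flaky_send(fail_once):
    """Build a fake send() that times out once for each hint in fail_once."""
    failures = set(fail_once)

    def send(hint, timeout_ms):
        if hint in failures:
            failures.discard(hint)  # the retry in a later round succeeds
            raise HintTimeout(hint)

    return send


queue = deque(["h1", "h2", "h3", "h4"])
send = make_flaky_send({"h3"})
first_round = deliver_hints(queue, send)   # stops when h3 times out
second_round = deliver_hints(queue, send)  # a later round resumes at h3
```

In this toy run the first round delivers two hints and aborts on the third,
and the following round delivers the remaining two.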

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/08/2012, at 9:36 PM, aaron morton <aaron@thelastpickle.com> wrote:

> HH works to a point. Specifically, it only collects hints for the
> first hour the node is down, and it has a safety valve to keep the node
> that is collecting hints from getting overwhelmed. Looking at the code,
> it takes a bit for that to trip, and you would get a TimeoutException
> coming back.
>
> Also when hints are replayed they are sent off as mutations, which may
> still be dropped by the target if they are not serviced before
> rpc_timeout. Sending nodes throttle their requests so it's unlikely but
> possible.
>
> HH is much more robust, but AFAIK repair is still _the_ way to
> ensure on-disk consistency.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 23/08/2012, at 6:59 AM, Rob Coli <rcoli@palominodb.com> wrote:
>
>> On Wed, Aug 22, 2012 at 8:37 AM, Senthilvel Rangaswamy
>> <senthilvel@gmail.com> wrote:
>>> We are running Cassandra 1.1.2 on EC2. Our database is primarily all
>>> counters and we don't do any deletes.
>>>
>>> Does nodetool repair do anything for such a database? All the docs I
>>> read for nodetool repair suggest that nodetool repair is needed only
>>> if there are deletes.
>>
>> Since 1.0, repair is only needed if a node crashes. If a node crashes,
>> my understanding is that a cluster-wide repair (with -pr on each node)
>> is required, because the crashed node could have lost a hint for any
>> other node.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-2034
>>
>> =Rob
>>
>> --
>> =Robert Coli
>> AIM&GTALK - rcoli@palominodb.com
>> YAHOO - rcoli.palominob
>> SKYPE - rcoli_palominodb
>


--Apple-Mail=_9C21151F-34C2-4D39-9C46-C5ED522C7444--