From: aaron morton <aaron@thelastpickle.com>
To: user@cassandra.apache.org
Subject: Re: Stream fails during repair, two nodes out-of-memory
Date: Mon, 25 Mar 2013 06:11:02 +1300

> compaction needs some disk I/O. Slowing down our compaction will improve overall
> system performance. Of course, you don't want to go too slow and fall behind too much.
In this case I was thinking of the memory use.

Compaction tasks are a bit like a storm of reads. If you are having problems with memory management, all those reads can result in increased GC.

> It looks like we hit OOM when repair starts streaming
> multiple cfs simultaneously.

Odd. It's not very memory intensive.

> I'm wondering if I should throttle streaming, and/or repair only one
> CF at a time.

Decreasing stream_throughput_outbound_megabits_per_sec may help, if the goal is just to get repair working.
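
As a rough sketch of both ideas (the throughput number is only illustrative, and MyKeyspace / MyColumnFamily are placeholders for your own schema):

    # cassandra.yaml - throttle outbound streaming; the node needs a restart to pick up a yaml change
    stream_throughput_outbound_megabits_per_sec: 100

    # and repair a single keyspace / column family at a time, e.g.
    nodetool repair MyKeyspace MyColumnFamily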

You may also want to increase phi_convict_threshold to 12; this will make it harder for a node to get marked as down, which can be handy when GC is causing problems and you have underpowered nodes. If a node is marked as down, the repair session will fail instantly.
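
In cassandra.yaml that would look something like (the default is 8):

    # make the failure detector slower to convict a node that is pausing for GC
    phi_convict_threshold: 12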

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 24/03/2013, at 9:12 AM, Dane Miller <dane@optimalsocial.com> wrote:

> On Fri, Mar 22, 2013 at 5:58 PM, Wei Zhu <wz1975@yahoo.com> wrote:
>> compaction needs some disk I/O. Slowing down our compaction will improve overall
>> system performance. Of course, you don't want to go too slow and fall behind too much.

> Hmm. Even after making the suggested configuration changes, repair
> still fails with OOM (but only one node died this time, which is an
> improvement). It looks like we hit OOM when repair starts streaming
> multiple cfs simultaneously. Just prior to OOM, the node loses
> contact with another node in the cluster and starts storing hints.

> I'm wondering if I should throttle streaming, and/or repair only one
> CF at a time.

>> From: "Dane Miller"
>> Subject: Re: Stream fails during repair, two nodes out-of-memory

>> On Thu, Mar 21, 2013 at 10:28 AM, aaron morton <aaron@thelastpickle.com> wrote:
>>> heap of 1867M is kind of small. According to the discussion on this list,
>>> it's advisable to have m1.xlarge.

>>> +1

>>> In cassandra-env.sh set the MAX_HEAP_SIZE to 4GB, and the NEW_HEAP_SIZE to
>>> 400M.
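
(A rough sketch of that in cassandra-env.sh; note the new generation setting is spelled HEAP_NEWSIZE in that file:)

    # cassandra-env.sh - fixed 4GB heap with a small new generation
    MAX_HEAP_SIZE="4G"
    HEAP_NEWSIZE="400M"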

>>> In the yaml file set

>>> in_memory_compaction_limit_in_mb to 32
>>> compaction_throughput_mb_per_sec to 8
>>> concurrent_compactors to 2

>>> This will slow down compaction a lot. You may want to restore some of these
>>> settings once you have things stable.
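
(Roughly, the corresponding cassandra.yaml lines, using the values suggested above:)

    # cassandra.yaml - rein in compaction while the cluster is unstable
    in_memory_compaction_limit_in_mb: 32
    compaction_throughput_mb_per_sec: 8
    concurrent_compactors: 2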

>>> You have an underpowered box for what you are trying to do.

>> Thanks very much for the info. Have made the changes and am retrying.
>> I'd like to understand, why does it help to slow compaction?

>> It does seem like the cluster is underpowered to handle our
>> application's full write load plus repairs, but it operates fine
>> otherwise.

>> On Wed, Mar 20, 2013 at 8:47 PM, Wei Zhu <wz1975@yahoo.com> wrote:
>>> It's clear you are out of memory. How big is your data size?

>> 120 GB per node, of which 50% is actively written/updated, and 50% is
>> read-mostly.

>> Dane

