Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of peichieh@gmail.com designates
 209.85.223.180 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAP7WDFWxzxG_5k8XgB1Uc5Pj7OxHbXCaKJnDB7Oa8tY7M8nNMg@mail.gmail.com>
References: 
 <CAP7WDFV=DHPEaDcarOsktuw_sxGFeYvZF5eMAaZNYOoU8OEz-A@mail.gmail.com>
	<CAPaCpY8BjMM+NcP6RJg+dsBBZnpXQKxj53DQjTjGCB62Rrjk5g@mail.gmail.com>
	<CAP7WDFWxzxG_5k8XgB1Uc5Pj7OxHbXCaKJnDB7Oa8tY7M8nNMg@mail.gmail.com>
Date: Fri, 6 Dec 2013 23:58:46 +0800
Message-ID: 
 <CAHO4itzGiVksxT-bGQpFExfW8QmvxT_ckX=u--RM4xkaLGwDHQ@mail.gmail.com>
Subject: Re: Write performance with 1.2.12
From: Jason Wee <peichieh@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=001a11c1e384a345c104ecdfb8dd

--001a11c1e384a345c104ecdfb8dd
Content-Type: text/plain; charset=ISO-8859-1

Hi srmore,

Perhaps you could use jconsole to connect to the JVM over JMX. Then, under
the MBeans tab, start inspecting the GC metrics.
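
With ParNew and CMS enabled, the beans to look at should be
java.lang:type=GarbageCollector,name=ParNew and
java.lang:type=GarbageCollector,name=ConcurrentMarkSweep, and the useful
attributes are CollectionCount and CollectionTime (milliseconds). Below is a
rough Python sketch of how two readings taken some time apart turn into a GC
share of wall-clock time. The sample numbers are made up for illustration,
not measured on any cluster, and gc_overhead is only a hypothetical helper
name.

```python
def gc_overhead(before, after, interval_ms):
    """Compare two readings of the GarbageCollector MBeans.

    before/after map a collector name to a (CollectionCount, CollectionTime)
    pair, with CollectionTime in milliseconds as the JVM reports it.
    interval_ms is the wall-clock time between the two readings.
    """
    report = {}
    for name, (count_after, time_after) in after.items():
        count_before, time_before = before.get(name, (0, 0))
        gc_ms = time_after - time_before
        report[name] = {
            "collections": count_after - count_before,
            "gc_ms": gc_ms,
            "share": gc_ms / interval_ms,  # fraction of the interval spent in GC
        }
    return report


# Hypothetical readings copied by hand from jconsole, 60 seconds apart.
before = {"ParNew": (1200, 36000), "ConcurrentMarkSweep": (10, 4000)}
after = {"ParNew": (1260, 39000), "ConcurrentMarkSweep": (11, 4600)}

report = gc_overhead(before, after, interval_ms=60_000)
for name, stats in report.items():
    print(f"{name}: {stats['collections']} collections, "
          f"{stats['gc_ms']} ms, {stats['share']:.1%} of wall time")
```

If the share stays at a few percent while the CPU is pegged, GC is probably
not the main culprit and compaction is the next thing to check.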

/Jason


On Fri, Dec 6, 2013 at 11:40 PM, srmore <comomore@gmail.com> wrote:

>
>
>
> On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak <vicky.kak@gmail.com> wrote:
>
>> Hard to say much without knowing about the cassandra configurations.
>>
>
> The cassandra configuration is
> -Xms8G
> -Xmx8G
> -Xmn800m
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=4
> -XX:MaxTenuringThreshold=2
> -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly
>
>
>
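
As a side note on those flags, here is a small Python sketch of the heap
layout they imply. It assumes the usual HotSpot meaning of the options
(SurvivorRatio is eden divided by one survivor space, and
CMSInitiatingOccupancyFraction is a percentage of the old generation); the
JVM aligns region sizes, so treat the results as approximate. The function
names are hypothetical.

```python
def young_gen_layout(xmn_mb, survivor_ratio):
    """Split the young generation into eden plus two survivor spaces.

    With SurvivorRatio=N, eden is N times the size of one survivor space,
    so the young generation is divided into N + 2 equal shares.
    """
    survivor_mb = xmn_mb / (survivor_ratio + 2)
    eden_mb = survivor_mb * survivor_ratio
    return eden_mb, survivor_mb


def cms_trigger_mb(heap_mb, xmn_mb, occupancy_percent):
    """Old-generation occupancy at which a CMS cycle is started."""
    old_gen_mb = heap_mb - xmn_mb
    return old_gen_mb * occupancy_percent / 100.0


# Values from the flags quoted above: -Xmx8G -Xmn800m
# -XX:SurvivorRatio=4 -XX:CMSInitiatingOccupancyFraction=75
eden_mb, survivor_mb = young_gen_layout(xmn_mb=800, survivor_ratio=4)
trigger_mb = cms_trigger_mb(heap_mb=8 * 1024, xmn_mb=800, occupancy_percent=75)

print(f"eden ~{eden_mb:.0f} MB, each survivor ~{survivor_mb:.0f} MB")
print(f"CMS starts at ~{trigger_mb:.0f} MB of old-gen occupancy")
```

With MaxTenuringThreshold=2, objects that survive two young collections are
promoted, so a write-heavy load can fill the old generation fairly quickly.
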
>> Yes, compactions/GCs could spike the CPU; I had similar behavior with my
>> setup.
>>
>
> Were you able to get around it?
>
>
>>
>> -VK
>>
>>
>> On Fri, Dec 6, 2013 at 7:40 PM, srmore <comomore@gmail.com> wrote:
>>
>>> We have a 3-node cluster running Cassandra 1.2.12. They are pretty big
>>> machines: 64G RAM with 16 cores; the Cassandra heap is 8G.
>>>
>>> The interesting observation is that when I send traffic to one node, its
>>> performance is 2x what I get when I send traffic to all the nodes. We ran
>>> 1.0.11 on the same box and observed a slight dip, but not the halving seen
>>> with 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing
>>> CL to ONE makes a slight improvement, but not much.
>>>
>>> The read_repair_chance is 0.1. We see some compactions running.
>>>
>>> Following is my iostat -x output; sda is the SSD (for the commit log) and
>>> sdb is the spinner.
>>>
>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>           66.46    0.00    8.95    0.01    0.00   24.58
>>>
>>> Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00    27.60  0.00  4.40     0.00   256.00    58.18     0.01    2.55   1.32   0.58
>>> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sda2              0.00    27.60  0.00  4.40     0.00   256.00    58.18     0.01    2.55   1.32   0.58
>>> sdb               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sdb1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> dm-0              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> dm-1              0.00     0.00  0.00  0.60     0.00     4.80     8.00     0.00    5.33   2.67   0.16
>>> dm-2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> dm-3              0.00     0.00  0.00 24.80     0.00   198.40     8.00     0.24    9.80   0.13   0.32
>>> dm-4              0.00     0.00  0.00  6.60     0.00    52.80     8.00     0.01    1.36   0.55   0.36
>>> dm-5              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> dm-6              0.00     0.00  0.00 24.80     0.00   198.40     8.00     0.29   11.60   0.13   0.32
>>>
>>>
>>>
>>> I can see I am CPU bound here but couldn't figure out exactly what is
>>> causing it: is it GC or compaction? I am thinking it is compaction; I
>>> see a lot of context switches and interrupts in my vmstat output.
>>>
>>> I don't see GC activity in the logs but see some compaction activity.
>>> Has anyone seen this, or does anyone know what can be done to free up
>>> the CPU?
>>>
>>> Thanks,
>>> Sandeep
>>>
>>>
>>>
>>
>

--001a11c1e384a345c104ecdfb8dd--