Subject: Re: Write performance with 1.2.12
From: Vicky Kak <vicky.kak@gmail.com>
To: user@cassandra.apache.org
Date: Fri, 6 Dec 2013 22:45:12 +0530

How long had the server been up — hours, days, months...?

On Fri, Dec 6, 2013 at 10:41 PM, srmore <comomore@gmail.com> wrote:

> Looks like I am spending some time in GC.
>
> java.lang:type=GarbageCollector,name=ConcurrentMarkSweep
>
>   CollectionTime  = 51707;
>   CollectionCount = 103;
>
> java.lang:type=GarbageCollector,name=ParNew
>
>   CollectionTime  = 466835;
>   CollectionCount = 21315;
>
> On Fri, Dec 6, 2013 at 9:58 AM, Jason Wee <peichieh@gmail.com> wrote:
>
>> Hi srmore,
>>
>> Perhaps if you use jconsole and connect to the JVM using JMX. Then under
>> the MBeans tab, start inspecting the GC metrics.
>>
>> /Jason
>>
>> On Fri, Dec 6, 2013 at 11:40 PM, srmore <comomore@gmail.com> wrote:
>>
>>> On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak <vicky.kak@gmail.com> wrote:
>>>
>>>> Hard to say much without knowing about the Cassandra configuration.
>>>
>>> The Cassandra configuration is
>>> -Xms8G
>>> -Xmx8G
>>> -Xmn800m
>>> -XX:+UseParNewGC
>>> -XX:+UseConcMarkSweepGC
>>> -XX:+CMSParallelRemarkEnabled
>>> -XX:SurvivorRatio=4
>>> -XX:MaxTenuringThreshold=2
>>> -XX:CMSInitiatingOccupancyFraction=75
>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>
>>>> Yes, compactions/GCs could spike the CPU; I had similar behavior with
>>>> my setup.
>>>
>>> Were you able to get around it ?
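The counters quoted above already say something on their own: dividing CollectionTime by CollectionCount gives the average pause per collection, and comparing total collection time to uptime — which is why the uptime question matters — gives the fraction of wall-clock time spent in GC. A minimal sketch of that arithmetic; the 24-hour uptime is purely an assumed placeholder, since the thread never states the real figure:

```java
public class GcShare {
    public static void main(String[] args) {
        // Counters from the java.lang:type=GarbageCollector MBeans quoted above.
        long cmsTimeMs = 51707, cmsCount = 103;
        long parNewTimeMs = 466835, parNewCount = 21315;

        // Average pause per collection: roughly 502 ms for CMS, 22 ms for ParNew.
        System.out.printf("avg CMS pause:    %.1f ms%n", (double) cmsTimeMs / cmsCount);
        System.out.printf("avg ParNew pause: %.1f ms%n", (double) parNewTimeMs / parNewCount);

        // Fraction of wall-clock time spent collecting; 24 h is an assumption,
        // not a number from the thread.
        long assumedUptimeMs = 24L * 60 * 60 * 1000;
        double gcShare = (double) (cmsTimeMs + parNewTimeMs) / assumedUptimeMs;
        System.out.printf("GC share over an assumed 24 h uptime: %.2f%%%n", gcShare * 100);
    }
}
```

If the node had been up for a day or more, these counters would put GC well under one percent of wall-clock time, which would point the CPU suspicion toward compaction instead.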
>>>> -VK
>>>>
>>>> On Fri, Dec 6, 2013 at 7:40 PM, srmore <comomore@gmail.com> wrote:
>>>>
>>>>> We have a 3-node cluster running Cassandra 1.2.12; they are pretty big
>>>>> machines, 64G RAM with 16 cores, and the Cassandra heap is 8G.
>>>>>
>>>>> The interesting observation is that when I send traffic to one node,
>>>>> its performance is 2x better than when I send traffic to all the
>>>>> nodes. We ran 1.0.11 on the same box and observed a slight dip, but
>>>>> not half as with 1.2.12. In both cases we were writing with
>>>>> LOCAL_QUORUM. Changing CL to ONE makes a slight improvement, but not
>>>>> much.
>>>>>
>>>>> The read_repair_chance is 0.1. We see some compactions running.
>>>>>
>>>>> Following is my iostat -x output; sda is the SSD (for the commit log)
>>>>> and sdb is the spinner.
>>>>>
>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>           66.46    0.00    8.95    0.01    0.00   24.58
>>>>>
>>>>> Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
>>>>> sda        0.00   27.60  0.00   4.40    0.00  256.00    58.18     0.01   2.55   1.32   0.58
>>>>> sda1       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>> sda2       0.00   27.60  0.00   4.40    0.00  256.00    58.18     0.01   2.55   1.32   0.58
>>>>> sdb        0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>> sdb1       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>> dm-0       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>> dm-1       0.00    0.00  0.00   0.60    0.00    4.80     8.00     0.00   5.33   2.67   0.16
>>>>> dm-2       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>> dm-3       0.00    0.00  0.00  24.80    0.00  198.40     8.00     0.24   9.80   0.13   0.32
>>>>> dm-4       0.00    0.00  0.00   6.60    0.00   52.80     8.00     0.01   1.36   0.55   0.36
>>>>> dm-5       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>> dm-6       0.00    0.00  0.00  24.80    0.00  198.40     8.00     0.29  11.60   0.13   0.32
>>>>>
>>>>> I can see I am CPU bound here but couldn't figure out exactly what is
>>>>> causing it. Is this caused by GC or compaction? I am thinking it is
>>>>> compaction; I see a lot of context switches and interrupts in my
>>>>> vmstat output.
>>>>>
>>>>> I don't see GC activity in the logs but see some compaction activity.
>>>>> Has anyone seen this, or does anyone know what can be done to free up
>>>>> the CPU?
>>>>>
>>>>> Thanks,
>>>>> Sandeep
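Jason's jconsole suggestion can also be done in code: the same java.lang:type=GarbageCollector MBeans quoted earlier in the thread are exposed through the standard GarbageCollectorMXBean API. A minimal sketch that dumps them for the local JVM — attaching to a remote Cassandra node would instead go through a JMXConnector, which is omitted here:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcDump {
    public static void main(String[] args) {
        // Each bean corresponds to one collector; with the flags quoted in
        // this thread the names would be "ParNew" and "ConcurrentMarkSweep".
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-20s count=%d  timeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling this periodically and diffing the counters shows whether GC time is actually growing during the CPU spikes, which is one way to separate the GC hypothesis from the compaction one.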