From: Anuj Wadehra
To: user@cassandra.apache.org
Date: Thu, 18 Feb 2016 17:22:37 +0000 (UTC)
Subject: Re: Debugging write timeouts on Cassandra 2.2.5

What's the GC overhead? Can you share your GC collector and settings?

What's your query pattern? Do you use secondary indexes, batches, IN clauses, etc.?


Anuj
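A minimal way to gather that information on a typical package install, assuming the default log location and a stock startup script, is something like:

    # Show the GC-related flags the Cassandra JVM was started with
    ps -ef | grep '[C]assandraDaemon' | tr ' ' '\n' | grep -e '^-XX' -e '^-Xm'

    # Rough view of GC overhead: pause durations reported by Cassandra itself
    grep -i 'GCInspector' /var/log/cassandra/system.log | tail -n 20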



On Thu, 18 Feb, 2016 at 8:45 pm, Mike Heffner
<mike@librato.com> wrote:
Alain,

Thanks for the suggestions.

Sure, tpstats are here: https://gist.github.com/mheffner/a979ae1a0304480b052a. Looking at the metrics across the ring, there were no blocked tasks nor dropped messages.

Iowait metrics look fine, so it doesn't appear to be blocking on disk. Similarly, there are no long GC pauses.
We haven't noticed latency on any particular table higher than others or correlated around the occurrence of a timeout. We have noticed with further testing that running cassandra-stress against the ring, while our workload is writing to the same ring, will incur similar 10 second timeouts. If our workload is not writing to the ring, cassandra-stress will run without hitting timeouts. This seems to imply that our workload pattern is causing something to block cluster-wide, since the stress tool writes to a different keyspace than our workload.
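For reference, a sketch of that kind of side-by-side run with the 2.1+ stress tool; the host list, consistency level and thread count below are illustrative, not the exact invocation used:

    # Stress writes go to the tool's own keyspace (keyspace1 by default)
    # while the application keeps writing to its own keyspace
    cassandra-stress write n=1000000 cl=QUORUM -rate threads=50 \
        -node 10.0.0.1,10.0.0.2,10.0.0.3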

I mentioned in another reply that we've tracked it to something between 2.0.x and 2.1.x, so we are focusing on narrowing which point release it was introduced in.

Cheers,

Mike

On Thu, Feb 18, 2016 at 3:33 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
Hi Mike,

What about the output of tpstats ? I imagine you have dropped messages there. Any blocked threads ? Could you paste this output here ?

May this be due to some network hiccup to access the disks as they are EBS ? Can you think of any way of checking this ? Do you have a lot of GC logs, how long are the pauses (use something like: grep -i 'GCInspector' /var/log/cassandra/system.log) ?

Something else you could check are local_writes stats to see if only one table is affected or this is keyspace / cluster wide. You can use metrics exposed by Cassandra or, if you have no dashboards, I believe a: 'nodetool cfstats <myks> | grep -e 'Table:' -e 'Local'' should give you a rough idea of local latencies.
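A rough consolidation of these checks across the ring; the host list and keyspace name are placeholders:

    ks=my_keyspace                              # placeholder keyspace name
    for h in 10.0.0.1 10.0.0.2 10.0.0.3; do     # placeholder node addresses
      echo "== $h =="
      nodetool -h "$h" tpstats                  # blocked threads / dropped messages
      nodetool -h "$h" cfstats "$ks" | grep -e 'Table:' -e 'Local'   # per-table local latencies
    done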

Those are just things I would check, I have not a clue on what is happening here, hope this will help.
C*heers,
-----------------
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com
2016-02-18 5:13 GMT+01:00 Mike Heffner <mike@librato.com>:
Jaydeep,

No, we don't use any light weight transactions.

Mike

On Wed, Feb 17, 2016 at 6:44 PM, Jaydeep Chovatia <chovatia.jaydeep@gmail.com> wrote:
Are you guys using light weight transactions in your write path?

On Thu, Feb 11, 2016 at 12:36 AM, Fabrice Facorat <fabrice.facorat@gmail.com> wrote:
Are your commitlog and data on the same disk ? If yes, you should put
commitlogs on a separate disk which doesn't have a lot of IO.

Other IO may have a great impact on your commitlog writing and
it may even block.

An example of the impact IO may have, even for async writes:
https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic
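A quick way to check this, assuming the default directory layout (substitute the data_file_directories and commitlog_directory values from cassandra.yaml):

    # Do the commitlog and data directories resolve to the same block device?
    df -h /var/lib/cassandra/data /var/lib/cassandra/commitlog

    # Watch utilization and await on the underlying device while writes run
    iostat -x 5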

2016-02-11 0:31 GMT+01:00 Mike Heffner <mike@librato.com>:
> Jeff,
>
> We have both commitlog and data on a 4TB EBS with 10k IOPS.
>
> Mike
>
> On Wed, Feb 10, 2016 at 5:28 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com>
> wrote:
>>
>> What disk size are you using?
>>
>>
>>
>> From: Mike Heffner
>> Reply-To: "user@cassandra.apache.org"
>> Date: Wednesday, February 10, 2016 at 2:24 PM
>> To: "user@cassandra.apache.org"
>> Cc: Peter Norton
>> Subject: Re: Debugging write timeouts on Cassandra 2.2.5
>>
>> Paulo,
>>
>> Thanks for the suggestion, we ran some tests against CMS and saw the same
>> timeouts. On that note though, we are going to try doubling the instance
>> sizes and testing with double the heap (even though current usage is low).
>>
>> Mike
>>
>> On Wed, Feb 10, 2016 at 3:40 PM, Paulo Motta <pauloricardomg@gmail.com>
>> wrote:
>>>
>>> Are you using the same GC settings as the staging 2.0 cluster? If not,
>>> could you try using the default GC settings (CMS) and see if that changes
>>> anything? This is just a wild guess, but there were reports before of
>>> G1-caused instabilities with small heap sizes (< 16GB - see CASSANDRA-10403
>>> for more context). Please ignore if you already tried reverting back to CMS.
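For reference, reverting to the stock CMS collector is roughly a matter of restoring the default flags wherever the JVM options are set (cassandra-env.sh on a stock 2.2 install, or the upstart config mentioned elsewhere in this thread); the values below are the long-standing defaults, not a tuned set:

    # cassandra-env.sh fragment: stock CMS settings instead of G1
    JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
    JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
    JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
    JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
    JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
    JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"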
>>>
>>> 2016-02-10 16:51 GMT-03:00 Mike Heffner <mike@librato.com>:
>>>>
>>>> Hi all,
>>>>
>>>> We've recently embarked on a project to update our Cassandra
>>>> infrastructure running on EC2. We are long time users of 2.0.x and are
>>>> testing out a move to version 2.2.5 running on VPC with EBS. Our test setup
>>>> is a 3 node, RF=3 cluster supporting a small write load (mirror of our
>>>> staging load).
>>>>
>>>> We are writing at QUORUM and while p95's look good compared to our
>>>> staging 2.0.x cluster, we are seeing frequent write operations that time out
>>>> at the max write_request_timeout_in_ms (10 seconds). CPU across the cluster
>>>> is < 10% and EBS write load is < 100 IOPS. Cassandra is running with the
>>>> Oracle JDK 8u60 and we're using G1GC and any GC pauses are less than 500ms.
>>>>
>>>> We run on c4.2xl instances with GP2 EBS attached storage for data and
>>>> commitlog directories. The nodes are using EC2 enhanced networking and have
>>>> the latest Intel network driver module. We are running on HVM instances
>>>> using Ubuntu 14.04.2.
>>>>
>>>> Our schema is 5 tables, all with COMPACT STORAGE. Each table is similar
>>>> to the definition here:
>>>> https://gist.github.com/mheffner/4d80f6b53ccaa24cc20a
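Purely as an illustration of the shape (the real definitions are in the gist above), a COMPACT STORAGE table on 2.x generally looks like the hypothetical example below; the keyspace, table and column names are made up:

    # Hypothetical example only; the actual tables are in the linked gist
    cqlsh -e "
    CREATE TABLE metrics.samples (
        key     text,
        column1 bigint,
        value   blob,
        PRIMARY KEY (key, column1)
    ) WITH COMPACT STORAGE;"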
>>>>
>>>> This is our cassandra.yaml:
>>>> https://gist.github.com/mheffner/fea80e6e939dd483f94f#file-cassandra-yaml
>>>>
>>>> Like I mentioned we use 8u60 with G1GC and have used many of the GC
>>>> settings in Al Tobey's tuning guide. This is our upstart config with JVM and
>>>> other CPU settings: https://gist.github.com/mheffner/dc44613620b25c4fa46d
>>>>
>>>> We've used several of the sysctl settings from Al's guide as well:
>>>> https://gist.github.com/mheffner/ea40d58f58a517028152
>>>>
>>>> Our client application is able to write using either Thrift batches
>>>> using the Astyanax driver or CQL async INSERTs using the Datastax Java driver.
>>>>
>>>> For testing against Thrift (our legacy infra uses this) we write batches
>>>> of anywhere from 6 to 1500 rows at a time. Our p99 for batch execution is
>>>> around 45ms but our maximum (p100) sits less than 150ms except when it
>>>> periodically spikes to the full 10 seconds.
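One way to see that two-band latency distribution directly from Cassandra's own histograms; the keyspace and table names are placeholders:

    # Coordinator-level read/write latency percentiles, including the max
    nodetool proxyhistograms

    # Per-table local latency percentiles (placeholder keyspace/table)
    nodetool cfhistograms my_keyspace my_table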
>>>>
>>>> Testing the same write path using CQL writes instead demonstrates
>>>> similar behavior. Low p99s except for periodic full timeouts. We enabled
>>>> tracing for several operations but were unable to get a trace that completed
>>>> successfully -- Cassandra started logging many messages as:
>>>>
>>>> INFO  [ScheduledTasks:1] - MessagingService.java:946 - _TRACE messages
>>>> were dropped in last 5000 ms: 52499 for internal timeout and 0 for cross
>>>> node timeout
>>>>
>>>> And all the traces contained rows with a "null" source_elapsed row:
>>>> https://gist.githubusercontent.com/mheffner/1d68a70449bd6688a010/raw/0327d7d3d94c3a93af02b64212e3b7e7d8f2911b/trace.out
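If individual traces keep getting dropped, a lighter-weight alternative is probabilistic tracing, roughly:

    # Sample ~0.1% of all requests on each node instead of tracing single queries
    nodetool settraceprobability 0.001

    # Later, inspect the collected sessions via cqlsh
    cqlsh -e "SELECT session_id, duration, started_at FROM system_traces.sessions LIMIT 20;"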
>>>>
>>>>
>>>> We've exhausted as many configuration option permutations as we can
>>>> think of. This cluster does not appear to be under any significant load and
>>>> latencies seem to largely fall in two bands: low normal or max timeout. This
>>>> seems to imply that something is getting stuck and timing out at the max
>>>> write timeout.
>>>>
>>>> Any suggestions on what to look for? We had debug enabled for a while but
>>>> we didn't see any msg that pointed to something obvious. Happy to provide
>>>> any more information that may help.
>>>>
>>>> We are pretty much at the point of sprinkling debug around the code to
>>>> track down what could be blocking.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Mike
>>>>
>>>> --
>>>>
>>>>   Mike Heffner <mike@librato.com>
>>>>   Librato, Inc.
>>>>
>>>
>>
>>
>>
>> --
>>
>>   Mike Heffner <mike@librato.com>
>>   Librato, Inc.
>>
>
>
>
> --
>
>   Mike Heffner <mike@librato.com>
>   Librato, Inc.
>



--
Close the World, Open the Net
http://www.linux-wizard.net



--

  Mike Heffner <mike@librato.com>
  Librato, Inc.




--

  Mike Heffner <mike@librato.com>
  Librato, Inc.
