From: "Jack Krupansky" <jack@basetechnology.com>
To: user@cassandra.apache.org
Subject: Re: Hot, large row
Date: Thu, 24 Jul 2014 16:07:41 -0400

Could it be some “fat columns” (cells with large blob or text values) rather than the cell count per se? IOW, a “big row” rather than a “wide row”?
 
And, could it be a large partition rather than a large row (many rows in a single partition)? Are clustering columns being used in the primary key?
 
-- Jack Krupansky
 
From: DuyHai Doan
Sent: Thursday, July 24, 2014 3:53 PM
To: user@cassandra.apache.org
Subject: Re: Hot, large row
 
Your extract of cfhistograms shows that there are no particular "wide rows". The widest has 61214 cells, which is big but not so huge as to be a real concern.

Turning on trace probability only gives you some "hints" about what kind of queries are being run; it does not give the exact partition key or other statement values, especially when you are using prepared statements...
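For what it's worth, once tracing is on you can at least pull the slowest traced sessions and look at their parameters. A rough sketch, assuming the DataStax Java driver 2.x (the contact point, LIMIT, and one-second cutoff are placeholders, not anything prescribed by Cassandra):

// Rough sketch: list the slowest traced sessions after enabling tracing with
// "nodetool settraceprobability". Assumes the DataStax Java driver 2.x;
// the contact point, LIMIT, and 1-second cutoff are placeholders.
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class SlowTraceDigger {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            Session session = cluster.connect();
            // system_traces.sessions stores duration in microseconds
            for (Row r : session.execute(
                    "SELECT session_id, duration, parameters FROM system_traces.sessions LIMIT 500")) {
                if (r.getInt("duration") > 1000000) { // slower than 1 second
                    System.out.println(r.getUUID("session_id") + " took "
                            + r.getInt("duration") / 1000 + " ms, parameters="
                            + r.getMap("parameters", String.class, String.class));
                }
            }
        } finally {
            cluster.close();
        }
    }
}

Per-step details for a given session_id live in system_traces.events, but as said above, prepared statement values won't be spelled out for you.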


"I am considering reducing read_request_timeout_in_ms: 5000 in cassandra.yaml so that it reduces the impact when this occurs." --> Don't do that, you'll only sweep the dust under the carpet. Find the real issue and fix it instead of changing a parameter to hide it.

One solution would be, on the client side, to activate some logging to show the CQL3 statements the application is issuing that may be overloading the server. I know that's easier said than done, but I don't have any other idea for the moment.
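As an illustration only, a minimal sketch of such logging around the DataStax Java driver (the "LoggingSession" wrapper and the 500 ms "slow" threshold are invented for the example, not an existing API):

// A minimal sketch, assuming the DataStax Java driver 2.x.
// "LoggingSession" and the 500 ms threshold are invented for illustration.
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;

public class LoggingSession {
    private final Session session;

    public LoggingSession(Session session) {
        this.session = session;
    }

    public ResultSet execute(Statement stmt) {
        long start = System.nanoTime();
        ResultSet rs = session.execute(stmt);
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        if (elapsedMs > 500) {
            // For prepared statements we can at least log the query string;
            // the bound values are the part that stays hidden.
            String cql = (stmt instanceof BoundStatement)
                    ? ((BoundStatement) stmt).preparedStatement().getQueryString()
                    : stmt.toString(); // toString() may not show the full CQL for every type
            System.err.println("Slow statement (" + elapsedMs + " ms): " + cql);
        }
        return rs;
    }
}

Even without the bound values, the query string plus timing can be enough to narrow down which access pattern keeps hitting the same large partition.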

-------- Shameless self-promotion ------

To support this kind of live prod debugging & investigation, I added a new dynamic query logging feature in Achilles: https://github.com/doanduyhai/Achilles/wiki/Statements-Logging-and-Tracing#dynamic-statements-logging
Once you hit the issue, this kind of feature may save your day...
 
 




On Thu, Jul 24, 2014 at 9:22 PM, Keith Wright <kwright@nanigans.com> wrote:
I can see from cfhistograms that I do have some wide rows (see below). I set trace probability as you suggested, but the output doesn’t appear to tell me what row was actually read, unless I missed something; I just see "executing prepared statement". Any ideas how I can find the row in question?
 
I am considering reducing read_request_timeout_in_ms: 5000 in cassandra.yaml so that it reduces the impact when this occurs.
 
Any help in identifying my issue would be GREATLY appreciated
 

Cell Count per Partition
    1 cells: 50449950
    2 cells: 14281828
    3 cells: 8093366
    4 cells: 5029200
    5 cells: 3103023
    6 cells: 3059903
    7 cells: 1903018
    8 cells: 1509297
   10 cells: 2420359
   12 cells: 1624895
   14 cells: 1171678
   17 cells: 1289391
   20 cells: 909777
   24 cells: 852081
   29 cells: 722925
   35 cells: 587067
   42 cells: 459473
   50 cells: 358744
   60 cells: 304146
   72 cells: 244682
   86 cells: 191045
  103 cells: 155337
  124 cells: 127061
  149 cells: 98913
  179 cells: 77454
  215 cells: 59849
  258 cells: 46117
  310 cells: 35321
  372 cells: 26319
  446 cells: 19379
  535 cells: 13783
  642 cells: 9993
  770 cells: 6973
  924 cells: 4713
 1109 cells: 3229
 1331 cells: 2062
 1597 cells: 1338
 1916 cells: 773
 2299 cells: 495
 2759 cells: 268
 3311 cells: 150
 3973 cells: 100
 4768 cells: 42
 5722 cells: 24
 6866 cells: 12
 8239 cells: 9
 9887 cells: 3
11864 cells: 0
14237 cells: 5
17084 cells: 1
20501 cells: 0
24601 cells: 2
29521 cells: 0
35425 cells: 0
42510 cells: 0
51012 cells: 0
61214 cells: 2

 
From: DuyHai Doan <doanduyhai@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, July 24, 2014 at 3:01 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Hot, large row
 
"How can I detect wide rows?" -->

nodetool cfhistograms <keyspace> <suspected column family>

Look at the "Column count" column (the last one) and identify lines in this column with a very high "Offset" value. In a well-designed application you should have a Gaussian distribution where 80% of your rows have a similar number of columns.
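If it helps, here is a small throwaway sketch (not part of nodetool; the 10000-cell cutoff is arbitrary) that scans histogram lines in the "N cells: M" form pasted above and prints the buckets past the cutoff:

// Throwaway sketch: pipe the "Cell Count per Partition" section of
// `nodetool cfhistograms` output into stdin and print the buckets whose
// offset exceeds an arbitrary cutoff (10000 cells here).
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class WidePartitionFlagger {
    public static void main(String[] args) throws Exception {
        long cutoff = 10000L;
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            String trimmed = line.trim();
            // Only histogram bucket lines, e.g. "61214 cells: 2"
            if (!trimmed.matches("\\d+\\s+cells:\\s+\\d+")) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            long offset = Long.parseLong(parts[0]);      // cells per partition (bucket)
            long partitions = Long.parseLong(parts[2]);  // partitions in this bucket
            if (offset >= cutoff && partitions > 0) {
                System.out.println(partitions + " partition(s) with ~" + offset + " cells");
            }
        }
    }
}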

"Anyone know what debug level I can set so that I can see what reads the hot node is handling?" -->

"nodetool settraceprobability <value>", where value is a small number (0.001), on the node where you encounter the issue. Activate the tracing for a while (5 minutes) and deactivate it (value = 0). Then look into the system_traces tables "events" & "sessions". It may help or not, since at that probability only about one request in every 1000 is traced.

"Any way to get the server to blacklist these wide rows automatically?" --> No


On Thu, Jul 24, 2014 at 8:48 PM, Keith Wright <kwright@nanigans.com> wrote:
Hi all,
 
   We are seeing an issue where, basically daily, one of our nodes spikes in load and churns under CMS heap pressure. It appears that reads are backing up, and my guess is that our application is reading a large row repeatedly. Our write structure can lend itself to wide rows very infrequently (<0.001%) and we do our best to detect and delete them, but obviously we’re missing a case. Hoping for assistance on the following questions:
  • How can I detect wide rows?
  • Anyone know what debug level I can set so that I can see what reads the hot node is handling? I’m hoping to see the “bad” row.
  • Any way to get the server to blacklist these wide rows automatically?
We’re using C* 2.0.6 with vnodes.
 
Thanks
 
 