From: Julien Anguenot <julien@anguenot.org>
Subject: Re: Cassandra eats all cpu cores, high load average
Date: Fri, 12 Feb 2016 11:24:14 -0600
To: user@cassandra.apache.org

If you are positive this is not compaction related, I would:

   1. check disk IOPs and latency on the EBS volume (dstat).
   2. turn GC logging on in cassandra-env.sh and use jstat to see what is happening to your HEAP (example commands below).
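For example, something along these lines (a sketch assuming a package install where cassandra-env.sh is under /etc/cassandra and logs go to /var/log/cassandra; adjust paths to your layout, flags shown are the Java 8 ones):

   # in cassandra-env.sh, turn the GC log on:
   JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
   JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime"

   # then watch heap occupancy and GC activity live, one sample per second:
   $ jstat -gcutil $(pgrep -f CassandraDaemon) 1000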

I have been asking about compactions initially because having one (1) big table written by all nodes and fully replicated to all nodes in the cluster would definitely trigger constant compactions on every node, depending on write throughput.

   J. 

On Feb 12, 2016, at 11:03 AM, Skvazh Roman <r@skvazh.com> wrote:

Does the load decrease and does the node answer requests “normally” when you disable auto-compaction? You actually see pending compactions on nodes having high load, correct?
Nope.

All seems legit here. Using G1 GC?
Yes

Problems also occurred on nodes without pending compactions.



On 12 Feb 2016, at 18:44, Julien Anguenot <julien@anguenot.org> wrote:


On Feb 12, 2016, at 9:24 AM, Skvazh Roman <r@skvazh.com> wrote:

I have disabled autocompaction and stopped it on the high-load node.
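For reference, a minimal sketch of that toggle on a node (my_ks and big_table are placeholder names):

   $ nodetool disableautocompaction my_ks big_table
   $ nodetool enableautocompaction my_ks big_table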

Does the load decrease and does the node answer requests “normally” when you disable auto-compaction? You actually see pending compactions on nodes having high load, correct?

Heap is 8 GB. gc_grace is 86400.
All SSTables are about 200-300 MB.

All seems legit here. Using G1 GC?

$ nodetool compactionstats
pending tasks: 14

Try to increase the compactors from 4 to 6-8 on a node, disable gossip, let it finish compacting, and put it back in the ring by enabling gossip. See what happens (rough sequence below).
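A rough sequence, assuming concurrent_compactors has already been bumped in cassandra.yaml on that node (which typically means restarting it first):

   $ nodetool disablegossip      # take the node out of the ring
   $ nodetool compactionstats    # watch pending tasks drain
   $ nodetool enablegossip       # rejoin once compactions have caught up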

The tombstones count growing is because auto-compactions are disabled on these nodes. Probably not your issue.

   J.



$ dstat -lvnr 10
---load-avg--- ---procs--- ------memory-usage----- ---paging-- -dsk/total- ---system-- ----total-cpu-usage---- -net/total- --io/total-
 1m   5m  15m |run blk new| used  buff  cach  free|  in   out | read  writ| int   csw |usr sys idl wai hiq siq| recv  send| read  writ
29.4 28.6 23.5|0.0   0 1.2|11.3G  190M 17.6G  407M|   0     0 |7507k 7330k|  13k   40k| 11   1  88   0   0   0|   0     0 |96.5  64.6
29.3 28.6 23.5| 29   0 0.9|11.3G  190M 17.6G  408M|   0     0 |   0   189k|9822  2319 | 99   0   0   0   0   0| 138k  120k|   0  4.30
29.4 28.6 23.6| 30   0 2.0|11.3G  190M 17.6G  408M|   0     0 |   0    26k|8689  2189 |100   0   0   0   0   0| 139k  120k|   0  2.70
29.4 28.7 23.6| 29   0 3.0|11.3G  190M 17.6G  408M|   0     0 |   0    20k|8722  1846 | 99   0   0   0   0   0| 136k  120k|   0  1.50 ^C


JvmTop 0.8.0 alpha - 15:20:37,  amd64, 16 cpus, Linux 3.14.44-3, load avg 28.09
http://code.google.com/p/jvmtop

PID 32505: org.apache.cassandra.service.CassandraDaemon
ARGS:
VMARGS: -ea -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar -XX:+CMSCl[...]
VM: Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 1.8.0_65
UP:  8:31m  #THR: 334  #THRPEAK: 437  #THRCREATED: 4694  USER: cassandra
GC-Time:  0: 8m  #GC-Runs: 6378  #TotalLoadedClasses: 5926
CPU: 97.96% GC:  0.00% HEAP: 6049m / 7540m  NONHEAP: 82m / n/a

  TID  NAME                   STATE     CPU      TOTALCPU  BLOCKEDBY
  447  SharedPool-Worker-45   RUNNABLE  60.47%   1.03%
  343  SharedPool-Worker-2    RUNNABLE  56.46%   3.07%
  349  SharedPool-Worker-8    RUNNABLE  56.43%   1.61%
  456  SharedPool-Worker-25   RUNNABLE  55.25%   1.06%
  483  SharedPool-Worker-40   RUNNABLE  53.06%   1.04%
  475  SharedPool-Worker-53   RUNNABLE  52.31%   1.03%
  464  SharedPool-Worker-20   RUNNABLE  52.00%   1.11%
  577  SharedPool-Worker-71   RUNNABLE  51.73%   1.02%
  404  SharedPool-Worker-10   RUNNABLE  51.10%   1.29%
  486  SharedPool-Worker-34   RUNNABLE  51.06%   1.03%
Note: Only top 10 threads (according cpu load) are shown!


On 12 Feb 2016, at 18:14, Julien Anguenot <julien@anguenot.org> wrote:

At the time when the load is high and you have to restart, do you see any pending compactions when using `nodetool compactionstats`?

Possible to see a `nodetool compactionstats` taken *when* the load is too high? Have you checked the size of your SSTables for that big table? Any large ones in there? What about the Java HEAP configuration on these nodes?
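For instance, quick checks along these lines (data and config paths assume a default package install; my_ks and big_table are placeholders):

   # biggest SSTables for that table
   $ ls -lhS /var/lib/cassandra/data/my_ks/big_table-*/*-Data.db | head
   # heap sizing actually configured
   $ grep -E 'MAX_HEAP_SIZE|HEAP_NEWSIZE' /etc/cassandra/cassandra-env.sh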

If you have too many tombstones, I would try to decrease gc_grace_seconds so they get cleared out earlier during compactions.
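For example, per table from cqlsh (keyspace, table, and the value are placeholders; keep it comfortably above your repair interval so deleted data does not get resurrected):

   $ cqlsh -e "ALTER TABLE my_ks.big_table WITH gc_grace_seconds = 43200;"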

 J.

On Feb 12, 2016, at 8:45 AM, Skvazh Roman <r@skvazh.com> wrote:

There are 1-4 compactions at that moment.
We have many tombstones which are not removed.
DroppableTombstoneRatio is 5-6 (greater than 1).
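One way to confirm that per SSTable is sstablemetadata (the path is a placeholder for the table's Data.db files):

   $ sstablemetadata /var/lib/cassandra/data/my_ks/big_table-*/*-Data.db | grep -i droppable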

On 12 Feb 2016, at 15:53, Julien Anguenot <julien@anguenot.org> wrote:

Hey, 

What about the compactions count when that is happening?

J.


On Feb 12, 2016, at 3:06 AM, Skvazh Roman <r@skvazh.com> wrote:

Hello!
We have a cluster of 25 c3.4xlarge nodes (16 cores, 32 GiB) with an attached 1.5 TB 4000 PIOPS EBS drive.
Sometimes user CPU on one or two nodes spikes to 100% and load average goes to 20-30 - read requests drop off.
Only a restart of the Cassandra service helps.
Please advise.

One big table with wide rows. 600 GB per node.
LZ4Compressor
LeveledCompaction

concurrent compactors: 4
compaction throughput: tried from 16 to 128
concurrent_readers: from 16 to 32
concurrent_writers: 128
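The matching cassandra.yaml keys for the settings above can be checked with something like this (default config path assumed):

   $ grep -E 'concurrent_compactors|compaction_throughput_mb_per_sec|concurrent_reads|concurrent_writes' /etc/cassandra/cassandra.yaml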


https://gist.github.com/rskvazh/de916327779b98a437a6


JvmTop 0.8.0 alpha - 06:51:10,  amd64, 16 cpus, Linux 3.14.44-3, load avg 19.35
http://code.google.com/p/jvmtop

Profiling PID 9256: org.apache.cassandra.service.CassandraDa

95.73% (  4.31s) ....google.common.collect.AbstractIterator.tryToComputeN()
 1.39% (  0.06s) com.google.common.base.Objects.hashCode()
 1.26% (  0.06s) io.netty.channel.epoll.Native.epollWait()
 0.85% (  0.04s) net.jpountz.lz4.LZ4JNI.LZ4_compress_limitedOutput()
 0.46% (  0.02s) net.jpountz.lz4.LZ4JNI.LZ4_decompress_fast()
 0.26% (  0.01s) com.google.common.collect.Iterators$7.computeNext()
 0.06% (  0.00s) io.netty.channel.epoll.Native.eventFdWrite()


ttop:

2016-02-12T08:20:25.605+0000 Process summary
process cpu=1565.15%
application cpu=1314.48% (user=1354.48% sys=-40.00%)
other: cpu=250.67%
heap allocation rate 146mb/s
[000405] user=76.25% sys=-0.54%  alloc=     0b/s - SharedPool-Worker-9
[000457] user=75.54% sys=-1.26%  alloc=     0b/s - SharedPool-Worker-14
[000451] user=73.52% sys= 0.29%  alloc=     0b/s - SharedPool-Worker-16
[000311] user=76.45% sys=-2.99%  alloc=     0b/s - SharedPool-Worker-4
[000389] user=70.69% sys= 2.62%  alloc=     0b/s - SharedPool-Worker-6
[000388] user=86.95% sys=-14.28% alloc=     0b/s - SharedPool-Worker-5
[000404] user=70.69% sys= 0.10%  alloc=     0b/s - SharedPool-Worker-8
[000390] user=72.61% sys=-1.82%  alloc=     0b/s - SharedPool-Worker-7
[000255] user=87.86% sys=-17.87% alloc=     0b/s - SharedPool-Worker-1
[000444] user=72.21% sys=-2.30%  alloc=     0b/s - SharedPool-Worker-12
[000310] user=71.50% sys=-2.31%  alloc=     0b/s - SharedPool-Worker-3
[000445] user=69.68% sys=-0.83%  alloc=     0b/s - SharedPool-Worker-13
[000406] user=72.61% sys=-4.40%  alloc=     0b/s - SharedPool-Worker-10
[000446] user=69.78% sys=-1.65%  alloc=     0b/s - SharedPool-Worker-11
[000452] user=66.86% sys= 0.22%  alloc=     0b/s - SharedPool-Worker-15
[000256] user=69.08% sys=-2.42%  alloc=     0b/s - SharedPool-Worker-2
[004496] user=29.99% sys= 0.59%  alloc=   30mb/s - CompactionExecutor:15
[004906] user=29.49% sys= 0.74%  alloc=   39mb/s - CompactionExecutor:16
[010143] user=28.58% sys= 0.25%  alloc=   26mb/s - CompactionExecutor:17
[000785] user=27.87% sys= 0.70%  alloc=   38mb/s - CompactionExecutor:12
[012723] user= 9.09% sys= 2.46%  alloc= 2977kb/s - RMI TCP Connection(2673)-127.0.0.1
[000555] user= 5.35% sys=-0.08%  alloc=  474kb/s - SharedPool-Worker-24
[000560] user= 3.94% sys= 0.07%  alloc=  434kb/s - SharedPool-Worker-22
[000557] user= 3.94% sys=-0.17%  alloc=  339kb/s - SharedPool-Worker-25
[000447] user= 2.73% sys= 0.60%  alloc=  436kb/s - SharedPool-Worker-19
[000563] user= 3.33% sys=-0.04%  alloc=  460kb/s - SharedPool-Worker-20
[000448] user= 2.73% sys= 0.27%  alloc=  414kb/s - SharedPool-Worker-21
[000554] user= 1.72% sys= 0.70%  alloc=  232kb/s - SharedPool-Worker-26
[000558] user= 1.41% sys= 0.39%  alloc=  213kb/s - SharedPool-Worker-23
[000450] user= 1.41% sys=-0.03%  alloc=  158kb/s - SharedPool-Worker-17






--
Julien Anguenot (@anguenot)
USA +1.832.408.0344
FR +33.7.86.85.70.44

