From: Keith Wright <kwright@nanigans.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wed, 21 Aug 2013 10:13:13 -0500
Subject: Re: Nodes get stuck

Hi all,

    One more question for you related to this issue. We are no longer reading from the global_user table, which showed the very high latency times via cfstats, but we are still writing and therefore compacting. I still see high load on the affected nodes (and tpstats showing pending read stage actions), which I assume is due to compactions running against the global_user table? In other words, is it expected that the same interval tree issue would occur during compactions? I assume yes, but I wanted to make sure.

Thanks
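(A side note on verifying that hypothesis: compaction activity is visible per node even with reads stopped. A minimal sketch using stock nodetool commands of that era; the throughput value is illustrative only, and the keyspace/table names are taken from the cfstats output later in this thread.)

$ nodetool compactionstats                                  # pending tasks plus the compactions currently running
$ nodetool cfstats | grep -A 5 'Column Family: global_user' # per-CF counters without dumping every keyspace
$ nodetool setcompactionthroughput 8                        # optionally throttle compaction (MB/s) to see if load follows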
From: Keith Wright <kwright@nanigans.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:48 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Nodes get stuck

Are many people running 1.2.8? Any issues? Just nervous about running on the latest; I would prefer to be a couple of versions behind, as new bugs tend to pop up.

Thanks all

From: Nate McCall <nate@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:33 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Nodes get stuck

We use 128m and gc_grace of 300 on a CF with highly transient data (it holds values for client locking via the wait-chain algorithm implementation in Hector).

Same minor version upgrades should be painless and doable with no downtime.

On Wed, Aug 21, 2013 at 8:28 AM, Keith Wright <kwright@nanigans.com> wrote:
We have our LCS sstable size at 64 MB and gc_grace at 86400. May I ask what values you use? I saw that in 2.0 they are setting the LCS default sstable size to 160 MB.

Does anyone see any risk in upgrading from 1.2.4 to 1.2.8? The upgrade steps do not appear to mention any required actions, and they indicate that a rolling upgrade should be safe.

From: Nate McCall <nate@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:07 AM
To: Cassandra Users <user@cassandra.apache.org>
Subject: Re: Nodes get stuck

Hit send before I saw your update. Yeah, turn gc_grace way down. You can turn your level size up a lot as well.

On Aug 21, 2013 7:55 AM, "Keith Wright" <kwright@nanigans.com> wrote:
So the stack appears to be related to walking tombstones for a fetch. Can you please give me your take on whether this is a plausible explanation:

  * Given our data model, we can experience wide rows. We protect against these by randomly reading a portion of the row on write and, if its size is beyond a certain threshold, deleting data.
  * This worked VERY well for some time; however, perhaps we hit a row that we deleted and that has many tombstones. The row is being requested frequently, so Cassandra is working very hard to process through all of its tombstones (currently a number of nodes equal to the RF are at high load, which again suggests this).

The question is what to do about it. This is an LCS table with gc_grace_seconds at 86400. I assume my only options are to force a major compaction via nodetool compact, or upgradesstables? How can I validate that this is the cause? How can I prevent it going forward? Set gc_grace_seconds to a much lower value for that table?

Thanks all!
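(To make those options concrete: gc_grace_seconds can be lowered per table from cqlsh, and the sstables then rewritten so expired tombstones become purgeable. A sketch only, assuming the keyspace is "users" as in the cfhistograms output later in this thread and that the table is visible to CQL3. The 3600 value is illustrative; it is only safe if repairs run more often than that, since gc_grace also guards against deleted data resurrecting.)

$ cqlsh
cqlsh> USE users;
cqlsh> ALTER TABLE global_user WITH gc_grace_seconds = 3600;  -- illustrative value, see caveat above
cqlsh> ALTER TABLE global_user WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160};  -- optional: larger LCS sstables, per the 2.0 default
cqlsh> exit;
$ nodetool upgradesstables users global_user   # rewrite sstables in place; note nodetool compact may have little effect under LCS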
From: Keith Wright <kwright@nanigans.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, August 21, 2013 8:31 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Nodes get stuck

Thank you for responding. I did a quick look, and my mutation stage threads are currently in TIMED_WAITING (as expected, since tpstats shows no active or pending); however, most of my read stage threads are RUNNABLE, with the stack traces below. I haven't dug into them yet but thought I would put them out there to see if anyone had any ideas, since we are currently in a production-down state.

Thanks all!

Most have the first stack:

java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
java.util.TimSort.sort(TimSort.java:203)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1719
State: RUNNABLE
Total blocked: 1,005  Total waited: 913

Stack trace:
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1722
State: RUNNABLE
Total blocked: 1,001  Total waited: 897

Stack trace:
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.binarySort(TimSort.java:265)
java.util.TimSort.sort(TimSort.java:208)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

From: Sylvain Lebresne <sylvain@datastax.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, August 21, 2013 6:21 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Nodes get stuck

A thread dump on one of the machines that has suspiciously high CPU might help figure out what it is that is taking all that CPU.
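(For the mechanics of that: a dump can be captured with the JDK's jstack, or with kill -3. A sketch, assuming the JDK tools are installed and that pgrep -f CassandraDaemon finds the right JVM; the pattern and output path are just examples.)

$ pid=$(pgrep -f CassandraDaemon)
$ jstack -l "$pid" > /tmp/cassandra-threads.$(date +%s).txt   # all thread stacks, with lock info
$ kill -3 "$pid"   # alternative if jstack hangs: the JVM prints the dump to its stdout log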
On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kwright@nanigans.com> wrote:
Some last minute info on this to hopefully enlighten. We are doing ~200 reads and writes across our 7 node SSD cluster right now (usually we can do closer to 20K reads at least) and seeing CPU load as follows for the nodes (with some ParNew to give an idea of GC):

001 – 1200% (ParNew at 120 ms / sec)
002 – 6% (ParNew at 0)
003 – 600% (ParNew at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%

There are no compactions running on 001; however, I did see a broken pipe error in the logs there (see below). Netstats for 001 shows nothing pending. It appears that all of the load/latency is related to one column family. You can see cfstats & cfhistograms output below, and note that we are using LCS. I have brought the odd cfhistograms behavior to the thread before and am not sure what's going on there. We are in a production-down situation right now, so any help would be much appreciated!!!

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms

Offset     SSTables  Write Latency  Read Latency    Row Size  Column Count
1              8866              0             0           0          3420
2              1001              0             0           0      99218975
3              1249              0             0           0     319713048
4              1074              0             0           0      25073893
5               132              0             0           0      15359199
6                 0              0             0           0      27794925
7                 0             12             0           0       7954974
8                 0             23             0           0       7733934
10                0            184             0           0      13276275
12                0            567             0           0       9077508
14                0           1098             0           0       5879292
17                0           2722             0           0       5693471
20                0           4379             0           0       3204131
24                0           8928             0           0       2614995
29                0          13525             0           0       1824584
35                0          16759             0           0       1265911
42                0          17048             0           0        868075
50                0          14162             5           0        596417
60                0          11806            15           0        467747
72                0           8569           108           0        354276
86                0           7042           276         227        269987
103               0           5936           372        2972        218931
124               0           4538           577         157        181360
149               0           2981          1076     7388090        144298
179               0           1929          1529    90535838        116628
215               0           1081          1450   182701876         93378
258               0            499          1125   141393480         74052
310               0            124           756    18883224         58617
372               0             31           460    24599272         45453
446               0             25           247    23516772         34310
535               0             10           146    13987584         26168
642               0             20           194    12091458         19965
770               0              8           196     9269197         14649
924               0              9           340     8082898         11015
1109              0              9           225     4762865          8058
1331              0              9           154     3330110          5866
1597              0              8           144     2367615          4275
1916              0              1           188     1633608          3087
2299              0              4           216     1139820          2196
2759              0              5           201      819019          1456
3311              0              4           194      600522          1135
3973              0              6           181      454566           786
4768              0             13           136      353886           587
5722              0              6           152      280630           400
6866              0              5            80      225545           254
8239              0              6           112      183285           138
9887              0              0            68      149820           109
11864             0              5            99      121722            66
14237             0             57            86       98352            50
17084             0             18            99       79085            35
20501             0              1            93       62423            11
24601             0              0            61       49471             9
29521             0              0            69       37395             5
35425             0              4            56       28611             6
42510             0              0            57       21876             1
51012             0              9            60       16105             0
61214             0              0            52       11996             0
73457             0              0            50        8791             0
88148             0              0            38        6430             0
105778            0              0            25        4660             0
126934            0              0            15        3308             0
152321            0              0             2        2364             0
182785            0              0             0        1631             0
219342            0              0             0        1156             0
263210            0              0             0         887             0
315852            0              0             0         618             0
379022            0              0             0         427             0
454826            0              0             0         272             0
545791            0              0             0         168             0
654949            0              0             0         115             0
785939            0              0             0          61             0
943127            0              0             0          58             0
1131752           0              0             0          34             0
1358102           0              0             0          19             0
1629722           0              0             0           9             0
1955666           0              0             0           4             0
2346799           0              0             0           5             0
2816159           0              0             0           2             0
3379391           0              0             0           0             0
4055269           0              0             0           0             0
4866323           0              0             0           0             0
5839588           0              0             0           0             0
7007506           0              0             0           0             0
8409007           0              0             0           0             0
10090808          0              0             0           0             0
12108970          0              0             0           0             0
14530764          0              0             0           0             0
17436917          0              0             0           0             0
20924300          0              0             0           0             0
25109160          0              0             0           0             0
ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
	at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
	at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
	at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
	at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
	at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
	at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
	at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
	at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
	at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
	at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
	at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
	at java.nio.channels.Channels.writeFully(Channels.java:98)
	at java.nio.channels.Channels.access$000(Channels.java:61)
	at java.nio.channels.Channels$1.write(Channels.java:174)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
	at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
	at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
	at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
	at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
	at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
	... 9 more

From: Keith Wright <kwright@nanigans.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, August 21, 2013 2:35 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Nodes get stuck

Still looking for help! We have stopped almost ALL traffic to the cluster, and still some nodes are showing almost 1000% CPU for cassandra with no iostat activity. We were running cleanup on one of the nodes that was not showing load spikes; however, now when I attempt to stop cleanup there via nodetool stop cleanup, the java task for stopping cleanup itself is at 1500% and has not returned after 2 minutes. This is VERY odd behavior. Any ideas? Hardware failure? Network? We are not seeing anything there but wanted to get ideas.

Thanks
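(One way to get visibility when a node is pegged like this and nodetool/JMX are too slow to respond: match the hottest OS threads against a thread dump. A sketch for Linux; <tid> is a placeholder for a thread ID reported by top, and the TIDs shown by top correspond to the nid=0x... values in a jstack dump.)

$ top -H -p $(pgrep -f CassandraDaemon)   # per-thread view; note the busiest TIDs
$ printf '%x\n' <tid>                     # convert a hot TID to hex
$ jstack $(pgrep -f CassandraDaemon) | grep -A 15 'nid=0x<hex-tid>'   # find the matching Java stack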
From: Keith Wright <kwright@nanigans.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, August 20, 2013 8:32 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Nodes get stuck

Hi all,

    We are using C* 1.2.4 with vnodes and SSDs. We have seen behavior recently where 3 of our nodes get locked up at high load, in what appears to be a GC spiral, while the rest of the cluster (7 total nodes) appears fine. When I run tpstats, I see the following (assuming tpstats returns at all), and top shows cassandra pegged at 2000%. Obviously we have a large number of blocked reads. In the past I could explain this due to unexpectedly wide rows; however, we have handled that. When the cluster starts to melt down like this, it's hard to get visibility into what's going on and what triggered the issue, as everything starts to pile on. OpsCenter becomes unusable, and because the affected nodes are under GC pressure, getting any data via nodetool or JMX is also difficult. What do people do to handle these situations? We are going to start graphing reads/writes/sec/CF to Ganglia in the hopes that it helps (see the sketch after the output below).

Thanks

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                       256       381     1245117434         0                 0
RequestResponseStage              0         0     1161495947         0                 0
MutationStage                     8         8      481721887         0                 0
ReadRepairStage                   0         0       85770600         0                 0
ReplicateOnWriteStage             0         0       21896804         0                 0
GossipStage                       0         0        1546196         0                 0
AntiEntropyStage                  0         0           5009         0                 0
MigrationStage                    0         0           1082         0                 0
MemtablePostFlusher               0         0          10178         0                 0
FlushWriter                       0         0           6081         0              2075
MiscStage                         0         0             57         0                 0
commitlog_archiver                0         0              0         0                 0
AntiEntropySessions               0         0              0         0                 0
InternalResponseStage             0         0              6         0                 0
HintedHandoff                     1         1            246         0                 0

Message type           Dropped
RANGE_SLICE                482
READ_REPAIR                  0
BINARY                       0
READ                    515762
MUTATION                    39
_TRACE                       0
REQUEST_RESPONSE            29
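(On the Ganglia point above: until per-CF graphing is wired up, a crude stopgap is to snapshot nodetool counters on an interval and graph the deltas of the Completed columns afterwards. A minimal sketch; the paths, the 60-second interval, and the grep window are arbitrary choices, not anything Cassandra requires.)

#!/bin/sh
# poll-cassandra.sh: append periodic tpstats/cfstats snapshots for later graphing
LOG=/var/log/cassandra            # assumed location; adjust as needed
while true; do
  echo "=== $(date -u +%FT%TZ) ===" >> "$LOG/tpstats.log"
  nodetool tpstats >> "$LOG/tpstats.log"
  nodetool cfstats | grep -A 20 'Column Family: global_user' >> "$LOG/cfstats.log"
  sleep 60
done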