Return-Path: Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: (qmail 85077 invoked from network); 22 Apr 2010 08:18:59 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 22 Apr 2010 08:18:59 -0000 Received: (qmail 79303 invoked by uid 500); 22 Apr 2010 08:18:58 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 79218 invoked by uid 500); 22 Apr 2010 08:18:57 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 79210 invoked by uid 99); 22 Apr 2010 08:18:56 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 22 Apr 2010 08:18:56 +0000 X-ASF-Spam-Status: No, hits=0.0 required=10.0 tests=FREEMAIL_FROM,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of alexli@gmail.com designates 74.125.83.172 as permitted sender) Received: from [74.125.83.172] (HELO mail-pv0-f172.google.com) (74.125.83.172) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 22 Apr 2010 08:18:47 +0000 Received: by pvg7 with SMTP id 7so1038119pvg.31 for ; Thu, 22 Apr 2010 01:18:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:subject:content-type :content-transfer-encoding; bh=qcMgNAdcBNvkTFLEelCeb6a28kPWVXwR8dfo8Zkjl6s=; b=aBGqmGbIKua3uc9EgjucN7spT3KhU91oAqd9nTDeArritnPBMYsPiT+NVjsBwn8Cz1 +sCtWdTfjA9Og/kMwCA/gwBvNTOeccowZ3P38Q/KpCEonqaqkJpCUgC/fZb3jj/wv+Z3 fETvxkQItE+VwrWfvglBadV+av7MfhlBhbFYY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject :content-type:content-transfer-encoding; b=h8K3ir1ZXrb/U3vO0tTOpbwBzEvIxCxPl2386aLM10tWD82111FJfc1/xSqzmmA05D 7UN8WC1PSXTe2zi8OwMbM1dN8adrc4Z0U5bSi8wtz5hVNls4cQwbdAqyuP4HkQ/ZHnm/ upVVzNWBei9G3+1e4/1BulbdwM8hTMqArTdX4= Received: by 10.114.249.2 with SMTP id w2mr8788675wah.85.1271924305497; Thu, 22 Apr 2010 01:18:25 -0700 (PDT) Received: from [10.10.4.6] (173-11-109-37-SFBA.hfc.comcastbusiness.net [173.11.109.37]) by mx.google.com with ESMTPS id b17sm3233067wam.10.2010.04.22.01.18.24 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 22 Apr 2010 01:18:24 -0700 (PDT) Message-ID: <4BD0064E.5010704@gmail.com> Date: Thu, 22 Apr 2010 01:18:22 -0700 From: Alex Li User-Agent: Thunderbird 2.0.0.24 (X11/20100411) MIME-Version: 1.0 To: user@cassandra.apache.org Subject: Periodically hiccups Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org Hello, We recently deployed a cluster of 5 Cassandra nodes into production, and ran into big problems with periodically hiccups (individual node goes down, high CPU, client connection timeout). It was terrible with 0.5 (one hiccups every 5-10 minutes), today we upgraded to 0.6.1, it happens less frequently now (likely once every 30 minutes or so). But it is still quite frustrating. We used ReplicationFactor=3 for all column families. 5 nodes are behind haproxy. Java client goes through haproxy. The most obvious behavior is: as soon as one node goes down, the connections between haproxy and Cassandra nodes just shoot up to 1000 (in normal case it is stable at 40, which should be really trivial for Cassandra), and the connections don't go down for quite a while. Meanwhile Java clients just get all kind of TimeoutException, then kept on retrying. Eventually we have to restart haproxy, then things go back to normal. Each node has 5GB max heap, powerful enough CPU (quad-core), software RAID mirror. We are definitely NOT putting lots of load yet, mostly 20-50 concurrent requests to Cassandra, but it is not holding up! Please help, we are on the verge of giving up Cassandra after 5 days of periodic "outage". Couple observations: - cfstats shows significant read latency on "system" keyspace, almost 5s (see below) - RecentReadLatencyMicros and RecentWriteLatencyMicros are super high for StorageProxy, as well as every column family in JMX: up to 152676.92 and 6950 (they are in ms, right?). However, in cfstats, they are quite small. - Every second we see 5-10 DigestMismatchException in the log: INFO [pool-1-thread-15857] 2010-04-22 00:37:37,887 StorageProxy.java (line 499) DigestMismatchException: Mismatch for key 1068022523 (d41d8cd98f00b204e9800998ecf8427e vs 0dd4cdaeeb1a334ae133c6955e109629) Please advice. Thank you! snippets of storage-conf, cfstats, tpstats are listed below: false org.apache.cassandra.locator.RackUnawareStrategy 3 org.apache.cassandra.locator.EndPointSnitch 10000 128 auto 512 64 32 8 64 64 256 0.3 60 8 32 periodic 10000 864000 root@cdb-006:/glass/sfw/cassandra# bin/nodetool -h localhost cfstats Keyspace: system Read Count: 878 Read Latency: 5752.042634396355 ms. Write Count: 2260398 Write Latency: 0.014567047926957996 ms. Pending Tasks: 0 Column Family: LocationInfo SSTable count: 2 Space used (live): 3569 Space used (total): 3569 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 1 Read Count: 1 Read Latency: NaN ms. Write Count: 6 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 2 Key cache size: 1 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 0 Compacted row maximum size: 0 Compacted row mean size: 0 Column Family: HintsColumnFamily SSTable count: 2 Space used (live): 70272035 Space used (total): 70272035 Memtable Columns Count: 56264 Memtable Data Size: 486854 Memtable Switch Count: 21 Read Count: 877 Read Latency: 13614.412 ms. Write Count: 2260392 Write Latency: 0.142 ms. Pending Tasks: 0 Key cache capacity: 2 Key cache size: 2 Key cache hit rate: 0.25 Row cache: disabled Compacted row minimum size: 78567 Compacted row maximum size: 39561901 Compacted row mean size: 27878603 ---------------- Keyspace: Titan Read Count: 8948702 Read Latency: 7.949136100185256 ms. Write Count: 3393490 Write Latency: 0.19255415398306758 ms. Pending Tasks: 0 Column Family: FbUser SSTable count: 6 Space used (live): 3675014807 Space used (total): 3675014807 Memtable Columns Count: 250055 Memtable Data Size: 9146339 Memtable Switch Count: 9 Read Count: 6591406 Read Latency: 8.361 ms. Write Count: 343030 Write Latency: 0.078 ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 200000 Key cache hit rate: 0.6341912478864628 Row cache: disabled Compacted row minimum size: 320 Compacted row maximum size: 586 Compacted row mean size: 479 Column Family: Payment SSTable count: 3 Space used (live): 2473 Space used (total): 2473 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 0 Read Count: 27728 Read Latency: 0.059 ms. Write Count: 0 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 0 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 0 Compacted row maximum size: 0 Compacted row mean size: 0 Column Family: Club SSTable count: 6 Space used (live): 363609256 Space used (total): 477968226 Memtable Columns Count: 124134 Memtable Data Size: 38144903 Memtable Switch Count: 140 Read Count: 63668 Read Latency: 29.187 ms. Write Count: 1966098 Write Latency: 0.184 ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 63751 Key cache hit rate: 0.6543073524771221 Row cache: disabled Compacted row minimum size: 274 Compacted row maximum size: 49784 Compacted row mean size: 7711 Column Family: RandomUsers SSTable count: 1 Space used (live): 1329464 Space used (total): 1329464 Memtable Columns Count: 170433 Memtable Data Size: 5525529 Memtable Switch Count: 8 Read Count: 58236 Read Latency: 6.215 ms. Write Count: 143339 Write Latency: 0.575 ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 21 Key cache hit rate: 1.0 Row cache: disabled Compacted row minimum size: 1244196 Compacted row maximum size: 1327409 Compacted row mean size: 1287929 Column Family: User SSTable count: 4 Space used (live): 583812728 Space used (total): 583812728 Memtable Columns Count: 248833 Memtable Data Size: 7909687 Memtable Switch Count: 12 Read Count: 2207667 Read Latency: 8.413 ms. Write Count: 941029 Write Latency: 0.107 ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 94940 Key cache hit rate: 0.9769078238427732 Row cache: disabled Compacted row minimum size: 259 Compacted row maximum size: 174340 Compacted row mean size: 3916 ---------------- Keyspace: Pandora Read Count: 1475530 Read Latency: 5.249669284257182 ms. Write Count: 856550 Write Latency: 0.16070848053236822 ms. Pending Tasks: 0 Column Family: Folder SSTable count: 7 Space used (live): 942926793 Space used (total): 942926793 Memtable Columns Count: 150119 Memtable Data Size: 814194 Memtable Switch Count: 21 Read Count: 604605 Read Latency: 7.539 ms. Write Count: 553015 Write Latency: 0.173 ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 111884 Key cache hit rate: 0.9770064886992118 Row cache: disabled Compacted row minimum size: 234 Compacted row maximum size: 1644541 Compacted row mean size: 1238 Column Family: Attachment SSTable count: 5 Space used (live): 754692822 Space used (total): 754692822 Memtable Columns Count: 548 Memtable Data Size: 7110 Memtable Switch Count: 8 Read Count: 22708 Read Latency: 6.950 ms. Write Count: 29835 Write Latency: 0.025 ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 8299 Key cache hit rate: 0.017857142857142856 Row cache: disabled Compacted row minimum size: 269 Compacted row maximum size: 274 Compacted row mean size: 272 Column Family: Message SSTable count: 7 Space used (live): 3689251440 Space used (total): 3689251440 Memtable Columns Count: 50399 Memtable Data Size: 1361773 Memtable Switch Count: 10 Read Count: 163630 Read Latency: 16.981 ms. Write Count: 257843 Write Latency: 0.113 ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 52427 Key cache hit rate: 0.5041365046535677 Row cache: disabled Compacted row minimum size: 272 Compacted row maximum size: 2754 Compacted row mean size: 752 Column Family: FolderInfo SSTable count: 0 Space used (live): 0 Space used (total): 0 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 0 Read Count: 8704 Read Latency: 0.008 ms. Write Count: 0 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 0 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 0 Compacted row maximum size: 0 Compacted row mean size: 0 Column Family: StandardByUUID1 SSTable count: 0 Space used (live): 0 Space used (total): 0 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 0 Read Count: 8704 Read Latency: 0.005 ms. Write Count: 0 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 0 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 0 Compacted row maximum size: 0 Compacted row mean size: 0 Column Family: User SSTable count: 4 Space used (live): 120121309 Space used (total): 120121309 Memtable Columns Count: 1786 Memtable Data Size: 50657 Memtable Switch Count: 8 Read Count: 641067 Read Latency: 2.728 ms. Write Count: 15857 Write Latency: 0.148 ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 45965 Key cache hit rate: 0.957460499034818 Row cache: disabled Compacted row minimum size: 271 Compacted row maximum size: 506 Compacted row mean size: 499 Column Family: Standard1 SSTable count: 0 Space used (live): 0 Space used (total): 0 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 0 Read Count: 8704 Read Latency: 0.005 ms. Write Count: 0 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 0 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 0 Compacted row maximum size: 0 Compacted row mean size: 0 Column Family: Standard2 SSTable count: 0 Space used (live): 0 Space used (total): 0 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 0 Read Count: 8704 Read Latency: 0.006 ms. Write Count: 0 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 0 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 0 Compacted row maximum size: 0 Compacted row mean size: 0 Column Family: Folder1 SSTable count: 0 Space used (live): 0 Space used (total): 0 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 0 Read Count: 8704 Read Latency: 0.005 ms. Write Count: 0 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 200000 Key cache size: 0 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 0 Compacted row maximum size: 0 Compacted row mean size: 0 ---------------- root@cdb-006:/glass/sfw/cassandra# bin/nodetool -h localhost tpstats Pool Name Active Pending Completed FILEUTILS-DELETE-POOL 0 0 70 STREAM-STAGE 0 0 0 RESPONSE-STAGE 0 0 18298968 ROW-READ-STAGE 1 0 10168985 LB-OPERATIONS 0 0 0 MESSAGE-DESERIALIZER-POOL 0 0 24960681 GMFD 0 0 93647 LB-TARGET 0 0 0 CONSISTENCY-MANAGER 0 0 0 ROW-MUTATION-STAGE 0 0 4280703 MESSAGE-STREAMING-POOL 0 0 12 LOAD-BALANCER-STAGE 0 0 0 FLUSH-SORTER-POOL 0 0 0 MEMTABLE-POST-FLUSHER 0 0 224 FLUSH-WRITER-POOL 0 0 224 AE-SERVICE-STAGE 0 0 22 HINTED-HANDOFF-POOL 1 5 73 > Here are some system load data I collected. top - 00:43:43 up 1 day, 20:15, 2 users, load average: 2.96, 2.74, 2.37 Tasks: 160 total, 1 running, 159 sleeping, 0 stopped, 0 zombie Cpu0 : 10.9%us, 2.6%sy, 0.0%ni, 83.2%id, 3.2%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 9.6%us, 2.2%sy, 0.0%ni, 85.4%id, 2.8%wa, 0.0%hi, 0.0%si, 0.0%st Cpu2 : 37.1%us, 4.2%sy, 0.0%ni, 23.3%id, 33.8%wa, 1.2%hi, 0.3%si, 0.0%st Cpu3 : 18.5%us, 3.9%sy, 0.0%ni, 62.6%id, 9.5%wa, 4.1%hi, 1.4%si, 0.0%st Cpu4 : 11.1%us, 2.6%sy, 0.0%ni, 80.1%id, 3.2%wa, 2.9%hi, 0.2%si, 0.0%st Cpu5 : 10.1%us, 2.7%sy, 0.0%ni, 84.4%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st Cpu6 : 20.1%us, 2.6%sy, 0.0%ni, 62.6%id, 14.7%wa, 0.0%hi, 0.0%si, 0.0%st Cpu7 : 18.9%us, 4.4%sy, 0.0%ni, 69.0%id, 7.7%wa, 0.0%hi, 0.1%si, 0.0%st Mem: 8188108k total, 8135168k used, 52940k free, 12968k buffers Swap: 62500792k total, 277484k used, 62223308k free, 2381704k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 18000 root 20 0 15.6g 6.7g 1.5g S 64 85.4 552:54.13 java 1 root 20 0 19316 672 412 S 0 0.0 0:03.36 init root@cdb-006:/glass/sfw/cassandra# vmstat 5 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 5 1 277984 50180 12888 2382292 84 67 880 127 10 4 17 4 69 10 3 1 278000 51536 12856 2381316 4 14 2556 37 8660 10714 7 4 82 8 2 0 277968 52084 12768 2380156 7 2 891 618 6287 8865 6 3 87 3 3 0 277872 52020 12764 2379412 24 7 1070 22 5492 10111 8 3 84 4 0 1 277528 50764 12760 2379176 2 10 1530 696 8773 13661 8 4 83 5