Subject: Re: performance problems on new cluster
From: aaron morton <aaron@thelastpickle.com>
Date: Fri, 12 Aug 2011 14:11:06 +1200
To: user@cassandra.apache.org

> iostat doesn't show a request queue bottleneck. The timeouts we are seeing are for reads. The latency on the nodes I have temporarily used for reads is around 2-45ms. The next token in the ring at an alternate DC is showing ~4ms with everything else around 0.05ms. tpstats doesn't show any active/pending.
> Reads are at CL.ONE & Writes using CL.ANY

OK, node latency is fine and you are using some pretty low consistency. You said NTS with RF 2, is that RF 2 for each DC?

The steps below may help get an idea of what's going on (there's a rough sketch of the commands after the list)...

1) use nodetool getendpoints to determine which nodes are replicas for a key.
2) connect directly to one of the endpoints with the CLI, ensure CL is ONE and do your test query.
3) connect to another node in the same DC that is not a replica and do the same.
4) connect to another node in a different DC and do the same.
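
To make those steps concrete, here's a rough sketch of the commands. The keyspace name (Keyspace1) is a placeholder for whatever your schema uses, cf1 / 'user-id' / 'seg' are taken from your query, and 9160 is the default Thrift port:

    # 1) find which nodes hold replicas for the key (run from any node)
    nodetool -h <any-node> getendpoints Keyspace1 cf1 user-id

    # 2) run the test query directly against one of the returned replicas;
    #    ONE is already the CLI default, setting it explicitly (where your
    #    CLI build supports the statement) just makes that obvious
    cassandra-cli -h <replica-ip> -p 9160
        use Keyspace1;
        consistencylevel as ONE;
        get cf1['user-id']['seg'];

    # 3) and 4) repeat the same CLI session against a non-replica in the
    #           same DC, then against a node in the other DC, and compare.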

Once you can repro it, try turning up the logging on the coordinator to DEBUG; you can do this via JConsole. Look for these lines...

* Command/ConsistencyLevel is...
* reading data locally... or reading data from...
* reading digest locally... or reading digest for from...
* Read timeout:...

You'll also see some lines about receiving messages from other nodes. Hopefully you can get an idea of which nodes are involved in a failing query. Getting a thrift TimedOutException on a read with CL ONE is pretty odd.

> What can I do to confirm that this issue is still outstanding and/or that we are affected by it?

It's in 0.8 and will not be fixed. My unscientific approach was to repair a single CF at a time, hoping that the differences would be smaller and less data would be streamed.

Minor compaction should help squish things down. If you want to get more aggressive, reduce the min compaction threshold and trigger a minor compaction with nodetool flush.
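
For example (a sketch only; Keyspace1 and cf1 are placeholders again, 2 is a lowered minimum and 32 is the default maximum; check nodetool's usage output on your build for the exact argument order):

    # lower the min number of SSTables needed to trigger a minor compaction
    # (defaults are 4 min / 32 max)
    nodetool -h <node> setcompactionthreshold Keyspace1 cf1 2 32

    # flushing writes out a fresh SSTable, which can kick off the minor compaction
    nodetool -h <node> flush Keyspace1 cf1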

> Version of failure detection? I've not seen anything on this so I suspect this is the default.

I was asking so I could see if there were any fixes in Gossip or the FailureDetector that you were missing. Check the CHANGES.txt file.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12 Aug 2011, at 12:48, Anton Winter wrote:

>
>> Is there a reason you are using the trunk and not one of the tagged releases? Official releases are a lot more stable than the trunk.
>>
> Yes, as we are using a combination of EC2 and colo servers we need to use broadcast_address from CASSANDRA-2491. The patch associated with that JIRA does not apply cleanly against 0.8, which is why we are using trunk.
>
>>> 1) thrift timeouts & general degraded response times
>> For reads or writes? What sort of queries are you running? Check the local latency on each node using cfstats and cfhistograms, and a bit of iostat http://spyced.blogspot.com/2010/01/linux-performance-basics.html What does nodetool tpstats say, is there a stage backing up?
>>
>> If the local latency is OK look at the cross DC situation. What CL are you using? Are nodes timing out waiting for nodes in other DCs?
>
> iostat doesn't show a request queue bottleneck. The timeouts we are seeing are for reads. The latency on the nodes I have temporarily used for reads is around 2-45ms. The next token in the ring at an alternate DC is showing ~4ms with everything else around 0.05ms. tpstats doesn't show any active/pending. Reads are at CL.ONE & Writes using CL.ANY
>
>>
>>> 2) *lots* of exception errors, such as:
>> Repair is trying to run on a response which is a digest response; this should not be happening. Can you provide some more info on the type of query you are running?
>>
> The query being run is get cf1['user-id']['seg']
>
>
>>> 3) ring imbalances during a repair (refer to the above nodetool ring output)
>> You may be seeing this
>> https://issues.apache.org/jira/browse/CASSANDRA-2280
>> I think it's a mistake that it is marked as resolved.
>>
> What can I do to confirm that this issue is still outstanding and/or that we are affected by it?
>
>>> 4) regular failure detection when any node does something only moderately stressful, such as a repair, or is under light load etc., but the node itself thinks it is fine.
>> What version are you using?
>>
> Version of failure detection? I've not seen anything on this so I suspect this is the default.
>
>
> Thanks,
> Anton
>