Date: Mon, 13 Feb 2012 09:21:17 +0100
References: <201202131403484201571@jike.com>
Subject: Re: keycache persisted to disk ?
From: Peter Schuller
To: user@cassandra.apache.org

> I actually have the opposite 'problem'. I have a pair of servers that have
> been static since mid last week, but have seen performance vary
> significantly (x10) for exactly the same query. I hypothesised it was
> various caches so I shut down Cassandra, flushed the O/S buffer cache and
> then brought it back up. The performance wasn't significantly different to
> the pre-flush performance

I don't get this thread at all :)

Why would restarting with clean caches be expected to *improve*
performance? And why is key cache loading involved other than to delay
start-up and hopefully pre-populate caches for better (not worse)
performance?

If you want to figure out why queries seem to be slow relative to
normal, you'll need to monitor the behavior of the nodes. Look at disk
I/O statistics primarily (everyone reading this running Cassandra who
isn't intimately familiar with "iostat -x -k 1" should go and read up
on it right away; make sure you understand the utilization and avg
queue size columns), CPU usage, whether compaction is happening, etc.

One easy way to see sudden bursts of poor behavior is to be heavily
reliant on cache, and then have sudden decreases in performance due to
compaction evicting data from page cache while also generating more
I/O. But that's total speculation. It is also the case that you cannot
expect consistent performance on EC2 and that might be it.

But my #1 advice: Log into the node while it is being slow, and
observe. Figure out what the bottleneck is. iostat, top, nodetool
tpstats, nodetool netstats, nodetool compactionstats.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
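
For anyone who wants to grab all of the above in one go while a node is being slow, here is a minimal sketch in Python. It is not part of the original message. It assumes a Linux host with sysstat and procps installed and nodetool on the PATH, talking to the local node. The trailing "5" on iostat (a sample count, so the command terminates), the "-b -n 1" flags on top (one batch-mode snapshot), and the names DIAGNOSTICS and capture are illustrative choices, not anything prescribed by the message.

```python
import shutil
import subprocess

# Commands named in the message. Assumptions: the sample count "5" is added
# to iostat so it exits, and top runs once in batch mode (Linux procps).
DIAGNOSTICS = [
    ["iostat", "-x", "-k", "1", "5"],
    ["top", "-b", "-n", "1"],
    ["nodetool", "tpstats"],
    ["nodetool", "netstats"],
    ["nodetool", "compactionstats"],
]


def capture(commands=None, timeout=30):
    """Run each command and return {command line: output or a short note}."""
    results = {}
    for argv in (DIAGNOSTICS if commands is None else commands):
        name = " ".join(argv)
        # Skip tools that are not installed instead of failing the whole run.
        if shutil.which(argv[0]) is None:
            results[name] = "(not found on PATH)"
            continue
        try:
            proc = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            # Fall back to stderr so error messages are not silently lost.
            results[name] = proc.stdout or proc.stderr
        except (subprocess.TimeoutExpired, OSError) as exc:
            results[name] = f"(failed: {exc})"
    return results
```

Calling capture() with no arguments and printing each entry gives one snapshot. It is only useful if taken while the slowness is actually happening, and ideally compared against a snapshot from a period of normal performance.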