From: Alaa Zubaidi <alaa.zubaidi@pdf.com>
Date: Thu, 04 Nov 2010 15:20:33 -0700
To: user@cassandra.apache.org
Subject: Re: SSD vs. HDD
Message-ID: <4CD331B1.50901@pdf.com>

Thanks for the advice. We are running on Windows, and I just added more memory
to the system (16 GB); I will run the test again with an 8 GB heap. The load is
continuous; however, CPU usage is around 40%, with a maximum of 70%. As for the
cache: I am not using it, because I am under the impression that caching is not
a good idea in my case, where data moves in and out of the cache very quickly.
Is that right?

Thanks

On 11/4/2010 3:14 AM, Nick Telford wrote:
> If you're bottlenecking on read I/O, making proper use of Cassandra's key
> cache and row cache will improve things dramatically.
>
> A little maths using the numbers you've provided tells me that you have
> about 80 GB of "hot" data (data valid in a 4-hour period). That's obviously
> too much to cache directly, but you can probably cache some or all of the
> row keys, depending on your column distribution among keys. This will
> prevent reads from having to hit the indexes of the relevant sstables,
> eliminating a seek per sstable.
>
> If you have a subset of this data that is read more than the rest, the row
> cache will help you out a lot too. Have a look at your access patterns and
> see if it's worthwhile caching some rows.
>
> If you make progress using the various caches but don't have enough memory,
> I'd compare the cost of expanding the available memory against the cost of
> switching to SSDs; I imagine more memory would be cheaper and would last
> longer.
>
> Finally, given your particular deletion pattern, it's probably worth looking
> at 0.7 and upgrading once it is released as stable. CASSANDRA-699 [1] adds
> support for TTL columns that automatically expire and get removed (during
> compaction) without the need for a manual deletion mechanism.
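[Editor's note: a minimal sketch of the TTL-column behaviour CASSANDRA-699 describes — a column written with a TTL stops being visible to reads once it expires, and is physically dropped at the next compaction. The class and function names here are illustrative only, not Cassandra's actual API.]

```python
import time

class TTLColumn:
    """Hypothetical model of a column written with a time-to-live."""
    def __init__(self, name, value, ttl, now=None):
        self.name = name
        self.value = value
        self.written_at = now if now is not None else time.time()
        self.expires_at = self.written_at + ttl  # TTL is in seconds

    def is_live(self, now):
        # Reads treat an expired column as absent.
        return now < self.expires_at

def compact(columns, now):
    """Drop expired columns, as compaction would, with no manual delete."""
    return [c for c in columns if c.is_live(now)]

# A 4-hour TTL matches the "data older than 4 hours is no longer
# relevant" pattern from this thread.
col = TTLColumn("reading", "42", ttl=4 * 3600, now=1000.0)
print(col.is_live(1000.0 + 3600))        # -> True: still live after 1 hour
print(col.is_live(1000.0 + 5 * 3600))    # -> False: expired after 5 hours
print(len(compact([col], 1000.0 + 5 * 3600)))  # -> 0: removed at compaction
```

The point of the mechanism is the last line: expiry happens as a side effect of compaction, replacing the per-row manual deletion the original poster is doing every 10 seconds.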
> Failing this, since data older than 4 hours is no longer relevant, you
> should reduce your GCGraceSeconds to something on the order of 4 hours.
> This will ensure deleted data is removed faster, keeping your sstables
> smaller and allowing the fs cache to operate more effectively.
>
> 1: https://issues.apache.org/jira/browse/CASSANDRA-699
>
> On 4 November 2010 08:18, Peter Schuller wrote:
>
>>> I am having time-out errors while reading.
>>> I have 5 CFs, but two CFs with high write/read traffic.
>>> The data is organized in time-series rows. In CF1 the new rows are read
>>> every 10 seconds and then the whole rows are deleted, while in CF2 the
>>> rows are read in different time-range slices and eventually deleted,
>>> maybe after a few hours.
>>
>> So the first thing to do is to confirm what the bottleneck is. If
>> you're having timeouts on reads, and assuming you're not doing reads of
>> hot-in-cache data so fast that CPU is the bottleneck (and given that
>> you ask about SSDs), the hypothesis is that you're disk-bound due
>> to seeking.
>>
>> Observe the node(s) and in particular use "iostat -x -k 1" (or an
>> equivalent graph) and look at the %util and avgqu-sz columns to
>> confirm that you are indeed disk-bound. Unless you're doing large
>> reads, you will likely see, on average, small reads in amounts that
>> simply saturate the underlying storage: %util at 100%, and avgqu-sz
>> probably approaching the concurrency level of your read traffic.
>>
>> Now, assuming that is true, the question is why. So:
>>
>> (1) Are you saturating the disk continually or just periodically?
>> (2) If periodically, do the periods of saturation correlate with
>> compaction being done by Cassandra (or, for that matter, something
>> else)?
>> (3) What is your data set size relative to system memory? What are your
>> system memory and JVM heap size? (Relevant because it is important to
>> look at how much memory the kernel will use for page caching.)
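[Editor's note: a back-of-the-envelope version of question (3), using numbers stated elsewhere in this thread — 16 GB of RAM, an 8 GB JVM heap, and roughly 80 GB of "hot" data. The 1 GB OS allowance is an assumption for illustration.]

```python
# Memory left for the kernel page cache after the JVM heap is carved out.
GB = 1024 ** 3

system_memory = 16 * GB
jvm_heap = 8 * GB
os_and_overhead = 1 * GB   # assumed rough allowance for the OS and other processes

page_cache = system_memory - jvm_heap - os_and_overhead
hot_data = 80 * GB          # ~4 hours of live data, per the thread

cacheable_fraction = page_cache / hot_data
print(f"page cache available: {page_cache / GB:.0f} GB")
print(f"fraction of hot data cacheable: {cacheable_fraction:.1%}")
```

Under these assumptions only a small fraction of the hot set fits in the page cache, which is why the thread steers toward caching row keys (small) rather than row data (large), and toward weighing more RAM against SSDs.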
>> As others have mentioned, the number of disk reads done for each
>> read from the database (assuming the data is not in cache) can be
>> affected by how the data is written (e.g., partial row writes, etc.).
>> That is one thing that can be addressed, as is restructuring the data
>> to allow reading more sequentially (if possible). That only helps along
>> one dimension, though, lessening somewhat the cost of cold reads. The
>> gains may be limited, and the real problem may be that you simply need
>> more memory for caching and/or more IOPS from your storage (i.e., more
>> disks, maybe SSDs, etc.).
>>
>> If, on the other hand, you're normally completely fine and are just
>> seeing periods of saturation associated with compaction, this may be
>> mitigated by software improvements: possibly rate-limiting reads
>> and/or writes during compaction, and avoiding buffer-cache thrashing.
>> There's a JIRA ticket for direct I/O
>> (https://issues.apache.org/jira/browse/CASSANDRA-1470). I don't think
>> there's a JIRA ticket for rate limiting, but I suspect, since you're
>> storing time-series data, that you're not storing very large values,
>> and I would expect compaction to be CPU-bound rather than coming close
>> to saturating the disk.
>>
>> In either case, please do report back, as it's interesting to figure
>> out what kinds of performance issues people are seeing.
>>
>> --
>> / Peter Schuller

-- 
Alaa Zubaidi
PDF Solutions, Inc.
333 West San Carlos Street, Suite 700
San Jose, CA 95110 USA
Tel: 408-283-5639 (or 408-280-7900 x5639)
Fax: 408-938-6479
Email: alaa.zubaidi@pdf.com