From: Alaa Zubaidi <alaa.zubaidi@pdf.com>
Date: Thu, 04 Nov 2010 15:20:33 -0700
To: user@cassandra.apache.org
Subject: Re: SSD vs. HDD
Message-ID: <4CD331B1.50901@pdf.com>

Thanks for the advice. We are running on Windows, and I just added more memory
to the system (16 GB); I will run the test again with an 8 GB heap. The load is
continuous; however, CPU usage is around 40%, with a maximum of 70%. As for the
cache: I am not using it, because I am under the impression that caching is not
a good idea in my case, where data moves in and out of the cache very quickly.
Is that right?

Thanks

On 11/4/2010 3:14 AM, Nick Telford wrote:
> If you're bottlenecking on read I/O, making proper use of Cassandra's key
> cache and row cache will improve things dramatically.
>
> A little maths using the numbers you've provided tells me that you have
> about 80 GB of "hot" data (data valid in a 4-hour period). That's obviously
> too much to cache directly, but you can probably cache some or all of the
> row keys, depending on your column distribution among keys. This will
> prevent reads from having to hit the indexes of the relevant sstables,
> eliminating a seek per sstable.
>
> If you have a subset of this data that is read more than the rest, the row
> cache will help you out a lot too. Have a look at your access patterns and
> see if it's worthwhile caching some rows.
>
> If you make progress using the various caches but don't have enough memory,
> I'd compare the cost of expanding the available memory against the cost of
> switching to SSDs; I imagine more memory would be cheaper and would last
> longer.
>
> Finally, given your particular deletion pattern, it's probably worth looking
> at 0.7 and upgrading once it is released as stable. CASSANDRA-699 [1] adds
> support for TTL columns that automatically expire and get removed (during
> compaction) without the need for a manual deletion mechanism.
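[Editor's note: a minimal sketch of the TTL-column behaviour CASSANDRA-699 describes — a column written with a TTL stops being visible to reads once it expires, and is physically dropped at the next compaction. The class and function names here are illustrative only, not Cassandra's actual API.]

```python
import time

class TTLColumn:
    """Hypothetical model of a column written with a time-to-live."""
    def __init__(self, name, value, ttl, now=None):
        self.name = name
        self.value = value
        self.written_at = now if now is not None else time.time()
        self.expires_at = self.written_at + ttl  # TTL is in seconds

    def is_live(self, now):
        # Reads treat an expired column as absent.
        return now < self.expires_at

def compact(columns, now):
    """Drop expired columns, as compaction would, with no manual delete."""
    return [c for c in columns if c.is_live(now)]

# A 4-hour TTL matches the "data older than 4 hours is no longer
# relevant" pattern from this thread.
col = TTLColumn("reading", "42", ttl=4 * 3600, now=1000.0)
print(col.is_live(1000.0 + 3600))        # -> True: still live after 1 hour
print(col.is_live(1000.0 + 5 * 3600))    # -> False: expired after 5 hours
print(len(compact([col], 1000.0 + 5 * 3600)))  # -> 0: removed at compaction
```

The point of the mechanism is the last line: expiry happens as a side effect of compaction, replacing the per-row manual deletion the original poster is doing every 10 seconds.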
> Failing this, since data older than 4 hours is no longer relevant, you
> should reduce your GCGraceSeconds to something on the order of 4 hours.
> This will ensure deleted data is removed faster, keeping your sstables
> smaller and allowing the fs cache to operate more effectively.
>
> 1: https://issues.apache.org/jira/browse/CASSANDRA-699
>
> On 4 November 2010 08:18, Peter Schuller wrote:
>
>>> I am having time-out errors while reading.
>>> I have 5 CFs, but two CFs with high write/read traffic.
>>> The data is organized in time-series rows. In CF1 the new rows are read
>>> every 10 seconds and then the whole rows are deleted, while in CF2 the
>>> rows are read in different time-range slices and eventually deleted,
>>> maybe after a few hours.
>>
>> So the first thing to do is to confirm what the bottleneck is. If
>> you're having timeouts on reads, and assuming you're not doing reads of
>> hot-in-cache data so fast that CPU is the bottleneck (and given that
>> you ask about SSDs), the hypothesis is that you're disk-bound due
>> to seeking.
>>
>> Observe the node(s) and in particular use "iostat -x -k 1" (or an
>> equivalent graph) and look at the %util and avgqu-sz columns to
>> confirm that you are indeed disk-bound. Unless you're doing large
>> reads, you will likely see, on average, small reads in amounts that
>> simply saturate the underlying storage: %util at 100%, and avgqu-sz
>> probably approaching the concurrency level of your read traffic.
>>
>> Now, assuming that is true, the question is why. So:
>>
>> (1) Are you saturating the disk continually or just periodically?
>> (2) If periodically, do the periods of saturation correlate with
>> compaction being done by Cassandra (or, for that matter, something
>> else)?
>> (3) What is your data set size relative to system memory? What are your
>> system memory and JVM heap size? (Relevant because it is important to
>> look at how much memory the kernel will use for page caching.)
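[Editor's note: a back-of-the-envelope version of question (3), using numbers stated elsewhere in this thread — 16 GB of RAM, an 8 GB JVM heap, and roughly 80 GB of "hot" data. The 1 GB OS allowance is an assumption for illustration.]

```python
# Memory left for the kernel page cache after the JVM heap is carved out.
GB = 1024 ** 3

system_memory = 16 * GB
jvm_heap = 8 * GB
os_and_overhead = 1 * GB   # assumed rough allowance for the OS and other processes

page_cache = system_memory - jvm_heap - os_and_overhead
hot_data = 80 * GB          # ~4 hours of live data, per the thread

cacheable_fraction = page_cache / hot_data
print(f"page cache available: {page_cache / GB:.0f} GB")
print(f"fraction of hot data cacheable: {cacheable_fraction:.1%}")
```

Under these assumptions only a small fraction of the hot set fits in the page cache, which is why the thread steers toward caching row keys (small) rather than row data (large), and toward weighing more RAM against SSDs.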
>> As others have mentioned, the number of disk reads done for each
>> read from the database (assuming the data is not in cache) can be
>> affected by how the data is written (e.g., partial row writes, etc.).
>> That is one thing that can be addressed, as is restructuring the data
>> to allow reading more sequentially (if possible). That only helps along
>> one dimension, though, lessening somewhat the cost of cold reads. The
>> gains may be limited, and the real problem may be that you simply need
>> more memory for caching and/or more IOPS from your storage (i.e., more
>> disks, maybe SSDs, etc.).
>>
>> If, on the other hand, you're normally completely fine and are just
>> seeing periods of saturation associated with compaction, this may be
>> mitigated by software improvements: possibly rate-limiting reads
>> and/or writes during compaction, and avoiding buffer-cache thrashing.
>> There's a JIRA ticket for direct I/O
>> (https://issues.apache.org/jira/browse/CASSANDRA-1470). I don't think
>> there's a JIRA ticket for rate limiting, but I suspect, since you're
>> storing time-series data, that you're not storing very large values,
>> and I would expect compaction to be CPU-bound rather than coming close
>> to saturating the disk.
>>
>> In either case, please do report back, as it's interesting to figure
>> out what kinds of performance issues people are seeing.
>>
>> --
>> / Peter Schuller

-- 
Alaa Zubaidi
PDF Solutions, Inc.
333 West San Carlos Street, Suite 700
San Jose, CA 95110 USA
Tel: 408-283-5639 (or 408-280-7900 x5639)
Fax: 408-938-6479
Email: alaa.zubaidi@pdf.com