cassandra-user mailing list archives

From Todd Burruss <bburr...@expedia.com>
Subject Re: ParNew and caching
Date Fri, 18 Nov 2011 23:12:34 GMT
After re-reading my post, what I meant to say is that I switched from
Serializing cache provider to ConcurrentLinkedHash cache provider and then
saw better performance, but still far worse than no caching at all:

- no caching at all : 25-30ms
- with Serializing provider : 1300+ms
- with Concurrent provider : 500ms

100% cache hit rate.  ParNew is the only stat that I see out of line, so
it seems like there is still a lot of copying going on.

On 11/18/11 2:40 PM, "Mohit Anchlia" <mohitanchlia@gmail.com> wrote:

>On Fri, Nov 18, 2011 at 1:46 PM, Todd Burruss <bburruss@expedia.com>
>wrote:
>> Ok, I figured something like that.  Switching to
>> ConcurrentLinkedHashCacheProvider I see it is a lot better, but still
>> instead of the 25-30ms response times I enjoyed with no caching, I'm
>> seeing 500ms at 100% hit rate on the cache.  No old gen pressure at all,
>> just ParNew crazy.
>>
>
>Are you saying that when you had off heap you saw better performance
>of 25-30 ms? And now it's 500ms to get 50 columns? What kind of load
>are you generating? What results do you see if you disabled row cache
>and just leave key cache on?
>
>There are a lot of factors to consider, so having more stats would be
>helpful.
>
>Please paste cfhistograms. Have you tried monitoring tpstats and netstats?
>
>What's your CL and RF?
>> More info on my use case: I am picking 50 columns from the 70k.
>> Since the whole row is in the cache, with no copying from off-heap or
>> disk buffers, it seems like it should be faster than non-cache mode.
>>
>> More thoughts :)
>>
>> On 11/18/11 6:39 AM, "Sylvain Lebresne" <sylvain@datastax.com> wrote:
>>
>>>On Fri, Nov 18, 2011 at 1:53 AM, Todd Burruss <bburruss@expedia.com>
>>>wrote:
>>>> I'm using Cassandra 1.0.  Been doing some testing on using Cassandra's
>>>> cache.  When I turn it on (using the CLI) I see ParNew jump from 3-4ms
>>>> to 200-300ms.  This really screws with response times, which jump from
>>>> ~25-30ms to 1300+ms.  I've increased new gen and that helps, but still
>>>> this is surprising to me, especially since 1.0 defaults to the
>>>> SerializingCacheProvider - off heap.
>>>> The interesting tidbit is that I have wide rows.  70k+ columns per
>>>> row, ~50 bytes per column value.  The cache only needs to be about 400
>>>> rows to hold all the data per node and JMX is reporting 100% cache
>>>> hits.  Nodetool ring reports < 2gb per node, my heap is 6gb and total
>>>> RAM is 16gb.
>>>> Thoughts?
>>>
>>>Your problem is the mix of wide rows and the serializing cache.
>>>What happens with the serializing cache is that our data is stored
>>>out of the heap. But that means that for each read to a row, we
>>>'deserialize' the row from the out-of-heap memory into the heap to
>>>return it. The thing is, when we do that, we do the full row each
>>>time. In other words, for each query we deserialize 70k+ columns
>>>even if we return only one. I'm willing to bet this is what is killing
>>>your response time. If you want to cache wide rows, I really
>>>suggest you use the ConcurrentLinkedHashCacheProvider
>>>instead.
>>>
>>>I'll also note that this explains the ParNew times too. Deserializing
>>>all those columns from off-heap creates lots of short-lived objects,
>>>and since you deserialize 70k+ on each query, that's quite some
>>>pressure on the new gen. Note that the serializing cache is
>>>actually minimizing the use of old gen, because that is the one
>>>that can create huge GC pauses with a big heap,
>>>but it actually puts more pressure on the new gen. This is by
>>>design, because new gen is much less of a problem than
>>>old gen.
>>>
>>>--
>>>Sylvain
>>
>>
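
The effect Sylvain describes can be illustrated with a toy model. The
sketch below is not Cassandra code: the class names SerializingCache and
ReferenceCache, the helper make_row, and the use of pickle as the
serialization format are all assumptions made for illustration. It only
counts how many columns have to be rebuilt as fresh objects to serve one
50-column read from a 70k-column row. With ~50 bytes per value, one such
row carries roughly 70,000 x 50 = 3.5 MB of values (ignoring column names
and per-object overhead), so 400 rows come to about 1.4 GB, which is in
line with the "< 2gb per node" figure in the thread.

```python
import pickle

COLUMNS_PER_ROW = 70_000   # from the thread: 70k+ columns per row
COLUMNS_PER_QUERY = 50     # from the thread: 50 columns picked per read
VALUE_BYTES = 50           # from the thread: ~50 bytes per column value


def make_row(n_columns):
    """Build a toy wide row (hypothetical layout): column name -> value."""
    return {f"col{i:06d}": b"x" * VALUE_BYTES for i in range(n_columns)}


class SerializingCache:
    """Toy stand-in for a serializing cache: rows are kept as opaque bytes."""

    def __init__(self):
        self._blobs = {}
        self.columns_materialized = 0

    def put(self, key, row):
        self._blobs[key] = pickle.dumps(row)

    def get_columns(self, key, names):
        # The whole row is rebuilt on every read, even for a few columns.
        row = pickle.loads(self._blobs[key])
        self.columns_materialized += len(row)
        return {name: row[name] for name in names}


class ReferenceCache:
    """Toy stand-in for an on-heap cache: keeps the live row object."""

    def __init__(self):
        self._rows = {}
        self.columns_materialized = 0

    def put(self, key, row):
        self._rows[key] = row

    def get_columns(self, key, names):
        # Only the requested columns are touched; nothing is deserialized.
        row = self._rows[key]
        self.columns_materialized += len(names)
        return {name: row[name] for name in names}


row = make_row(COLUMNS_PER_ROW)
wanted = list(row)[:COLUMNS_PER_QUERY]

serializing = SerializingCache()
reference = ReferenceCache()
serializing.put("row-1", row)
reference.put("row-1", row)

from_serializing = serializing.get_columns("row-1", wanted)
from_reference = reference.get_columns("row-1", wanted)

raw_value_bytes_per_row = COLUMNS_PER_ROW * VALUE_BYTES

print("columns rebuilt per read (serializing):", serializing.columns_materialized)
print("columns touched per read (reference):  ", reference.columns_materialized)
print("raw value bytes per row:               ", raw_value_bytes_per_row)
```

Both caches return the same 50 columns; the difference is only in how
much short-lived garbage each read leaves behind. The model says nothing
about why the on-heap provider in the thread still measured 500ms.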
