incubator-cassandra-user mailing list archives

From Mohit Anchlia <mohitanch...@gmail.com>
Subject Re: ParNew and caching
Date Fri, 18 Nov 2011 17:58:11 GMT
On Fri, Nov 18, 2011 at 9:42 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:
> On Fri, Nov 18, 2011 at 6:31 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
>> On Fri, Nov 18, 2011 at 7:47 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:
>>> On Fri, Nov 18, 2011 at 4:23 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
>>>> On Fri, Nov 18, 2011 at 6:39 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:
>>>>> On Fri, Nov 18, 2011 at 1:53 AM, Todd Burruss <bburruss@expedia.com> wrote:
>>>>>> I'm using Cassandra 1.0 and have been doing some testing with
>>>>>> Cassandra's cache. When I turn it on (using the CLI) I see ParNew
>>>>>> jump from 3-4ms to 200-300ms. This really hurts response times,
>>>>>> which jump from ~25-30ms to 1300+ms. I've increased new gen and that
>>>>>> helps, but this is still surprising to me, especially since 1.0
>>>>>> defaults to the SerializingCacheProvider, which is off heap.
>>>>>> The interesting tidbit is that I have wide rows: 70k+ columns per
>>>>>> row, ~50 bytes per column value. The cache only needs to hold about
>>>>>> 400 rows to catch all the data per node, and JMX is reporting 100%
>>>>>> cache hits. Nodetool ring reports < 2 GB per node, my heap is 6 GB
>>>>>> and total RAM is 16 GB.
>>>>>> Thoughts?
>>>>>
>>>>> Your problem is the mix of wide rows and the serializing cache.
>>>>> What happens with the serializing cache is that the data is stored
>>>>> off heap. But that means that for each read of a row, we
>>>>> 'deserialize' the row from the off-heap memory into the heap to
>>>>> return it. The thing is, when we do that, we deserialize the full
>>>>> row each time. In other words, for each query we deserialize 70k+
>>>>> columns even to return only one. I'm willing to bet this is what is
>>>>> killing your response time. If you want to cache wide rows, I really
>>>>> suggest you use the ConcurrentLinkedHashCacheProvider
>>>>> instead.
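
[Illustration] The difference between the two cache styles described above can be sketched with a small model. This is not Cassandra's code: the class names, the byte layout and the `columns_materialized` counter are all made up for illustration, and a Python `bytes` blob merely stands in for off-heap memory. It only shows why a cache that stores rows serialized has to rebuild every column on each read, while a cache holding live objects does not.

```python
import struct


def serialize_row(columns):
    """Pack {name: value} (both bytes) into one length-prefixed blob."""
    parts = [struct.pack(">I", len(columns))]
    for name, value in columns.items():
        parts.append(struct.pack(">H", len(name)) + name)
        parts.append(struct.pack(">I", len(value)) + value)
    return b"".join(parts)


def deserialize_row(blob):
    """Rebuild the whole row; every column becomes a fresh object."""
    (count,) = struct.unpack_from(">I", blob, 0)
    offset = 4
    columns = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from(">H", blob, offset)
        offset += 2
        name = blob[offset:offset + name_len]
        offset += name_len
        (value_len,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        columns[name] = blob[offset:offset + value_len]
        offset += value_len
    return columns


class SerializingRowCache:
    """Hypothetical model: rows are kept as opaque blobs ("off heap")."""

    def __init__(self):
        self._blobs = {}
        self.columns_materialized = 0  # illustrative counter only

    def put(self, key, columns):
        self._blobs[key] = serialize_row(columns)

    def get_column(self, key, name):
        # The full row is rebuilt even though one column is wanted.
        row = deserialize_row(self._blobs[key])
        self.columns_materialized += len(row)
        return row[name]


class OnHeapRowCache:
    """Hypothetical model: rows are kept as live objects ("on heap")."""

    def __init__(self):
        self._rows = {}
        self.columns_materialized = 0  # stays 0: reads copy nothing

    def put(self, key, columns):
        self._rows[key] = dict(columns)

    def get_column(self, key, name):
        return self._rows[key][name]


def make_wide_row(n_columns=70_000, value_size=50):
    """Build a row shaped like the one in the thread (70k x 50 bytes)."""
    return {b"col%06d" % i: bytes(value_size) for i in range(n_columns)}
```

Reading a single column through `SerializingRowCache.get_column` increases `columns_materialized` by the full column count of the row, whereas `OnHeapRowCache.get_column` is a plain lookup. All of the objects created in the first case are garbage as soon as the read returns.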
>>>>
>>>> What happens when using ConcurrentLinkedHashCache? What is the
>>>> implementation like and why is it better?
>>>
>>> With ConcurrentLinkedHashCache, the cache is in the heap. So there
>>> is no deserialization/copy during gets, and having wide rows is not a
>>> problem. The one caveat is that if you enable the cache on a column
>>> family with wide rows, you have to keep in mind that we always keep
>>> full rows in cache.
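
[Illustration] Since full rows are always cached, a rough lower bound on the cache footprint follows from the numbers quoted earlier in the thread (about 400 rows, 70k columns per row, ~50 bytes per value). The sketch below counts raw value bytes only; column names, timestamps and per-object JVM overhead are not given in the thread and are deliberately left out, so the real on-heap size would be larger. Treating "6gb" as 6 GiB is also an assumption.

```python
def raw_row_bytes(columns_per_row, bytes_per_value):
    """Raw value bytes in one row (no names, timestamps or overhead)."""
    return columns_per_row * bytes_per_value


def raw_cache_bytes(rows, columns_per_row, bytes_per_value):
    """Raw value bytes for a cache holding `rows` full rows."""
    return rows * raw_row_bytes(columns_per_row, bytes_per_value)


# Figures taken from the thread.
ROWS = 400
COLUMNS_PER_ROW = 70_000
BYTES_PER_VALUE = 50
HEAP_BYTES = 6 * 1024**3  # assumption: "6gb" read as 6 GiB

row_bytes = raw_row_bytes(COLUMNS_PER_ROW, BYTES_PER_VALUE)
cache_bytes = raw_cache_bytes(ROWS, COLUMNS_PER_ROW, BYTES_PER_VALUE)
heap_fraction = cache_bytes / HEAP_BYTES

print(f"one row:    {row_bytes / 1e6:.1f} MB of values")
print(f"400 rows:   {cache_bytes / 1e9:.1f} GB of values")
print(f"heap share: {heap_fraction:.0%} of a 6 GiB heap (lower bound)")
```

The 1.4 GB of raw values is consistent with the "< 2 GB per node" reported by nodetool, and it is the amount that would live in the old generation with an on-heap cache.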
>>>
>>
>> Wouldn't it move the problem to GC pauses from not being able to clean
>> up the old generation? I am assuming these rows in the concurrent hash
>> map will get migrated to old gen.
>
> Kinda, yes, that's why we have a serializing cache :)
>
> I mean, caching rows of 70k+ columns is *not* the typical case we've
> optimized for (https://issues.apache.org/jira/browse/CASSANDRA-1956
> should improve things here) and so yes, neither the serializing cache nor
> the linked hash one will be perfect in that case. But the serializing
> cache is just worse in that specific case.

Thanks! This makes sense.

>
> --
> Sylvain
>
>>>>
>>>>>
>>>>> I'll also note that this explains the ParNew times too. Deserializing
>>>>> all those columns from off-heap creates lots of short-lived objects,
>>>>> and since you deserialize 70k+ on each query, that's quite some
>>>>> pressure on the new gen. Note that the serializing cache is
>>>>> actually minimizing the use of old gen, because that is the one
>>>>> that can create huge GC pauses with a big heap,
>>>>> but it actually puts more pressure on the new gen. This is by
>>>>> design, because new gen is much less of a problem than
>>>>> old gen.
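
[Illustration] The new gen pressure described above can be put into rough numbers. The per-read figure comes from the thread (70k columns of ~50 bytes); the read rate and young generation size below are hypothetical placeholders, since the thread does not give them. The sketch only models how quickly the young generation fills with raw value bytes. It ignores object overhead and says nothing about how long each ParNew pause lasts.

```python
def bytes_allocated_per_read(columns_per_row, bytes_per_value):
    """Raw value bytes materialized when a full row is deserialized."""
    return columns_per_row * bytes_per_value


def seconds_between_young_collections(young_gen_bytes, reads_per_sec,
                                      bytes_per_read):
    """Time to fill the young generation at a steady allocation rate."""
    return young_gen_bytes / (reads_per_sec * bytes_per_read)


per_read = bytes_allocated_per_read(70_000, 50)  # figures from the thread

# Hypothetical workload and sizing, chosen only to make the arithmetic
# concrete; neither number appears in the thread.
HYPOTHETICAL_READS_PER_SEC = 100
HYPOTHETICAL_YOUNG_GEN_BYTES = 350_000_000

interval = seconds_between_young_collections(
    HYPOTHETICAL_YOUNG_GEN_BYTES, HYPOTHETICAL_READS_PER_SEC, per_read)

print(f"garbage per read: {per_read / 1e6:.1f} MB")
print(f"young gen fills about every {interval:.1f} s")
```

Doubling the young generation in this model doubles the interval between collections, which matches the later remark in the thread that a bigger young generation means young collections kick in less often.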
>>>>
>>>> In this scenario, would it help if the young generation space is increased?
>>>
>>> That's a hard one to answer because GC tuning is a bit of a black
>>> art, where testing and benchmarking are often key. Having a bigger
>>> young generation means young collections kick in less often,
>>> but on the other hand it reduces the space left for the old generation.
>>> But again, I don't think the problem is really the GC here, at least not
>>> primarily.
>>>
>>> --
>>> Sylvain
>>>
>>>>
>>>>>
>>>>> --
>>>>> Sylvain
>>>>>
>>>>
>>>
>>
>
