Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: local policy)
Message-ID: <50BE36A4.7030504@yahoo.com>
Date: Tue, 04 Dec 2012 12:45:08 -0500
From: Mike <mtheroux2@yahoo.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121026 Thunderbird/16.0.2
MIME-Version: 1.0
To: user@cassandra.apache.org
Subject: Re: Row caching + Wide row column family == almost crashed?
References: <50BBC9C6.1050007@yahoo.com> <50BD3BD7.6020605@dehora.net>
In-Reply-To: <50BD3BD7.6020605@dehora.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit

Thanks for all the responses!

On 12/3/2012 6:55 PM, Bill de hÓra wrote:
> A Cassandra JVM will generally not function well with caches and 
> wide rows. Probably the most important thing to understand is Ed's 
> point, that the row cache caches the entire row, not just the slice 
> that was read out. What you've seen is almost exactly the observed 
> behaviour I'd expect with enabling either cache provider over wide rows.
>
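To put rough numbers on the "entire row, not just the slice" point, here is
a back-of-the-envelope sketch. It is a toy model, not Cassandra code: the
average column size and the exact row width are assumptions I picked for
illustration; only the slice limit of 200 and the 100MB cache size come
from the original report.

```python
# Toy arithmetic only; nothing here calls into Cassandra.
AVG_COLUMN_BYTES = 100             # assumed average column size
COLUMNS_IN_ROW = 50_000            # assumed width of one "wide" row
SLICE_LIMIT = 200                  # limit used by the range queries
ROW_CACHE_BYTES = 100 * 1024 * 1024  # 100MB row cache per node


def slice_bytes(columns_requested: int, avg_column_bytes: int) -> int:
    """Approximate bytes the client actually asked for."""
    return columns_requested * avg_column_bytes


def cached_row_bytes(columns_in_row: int, avg_column_bytes: int) -> int:
    """Approximate bytes held if the whole row is cached."""
    return columns_in_row * avg_column_bytes


returned = slice_bytes(SLICE_LIMIT, AVG_COLUMN_BYTES)
cached = cached_row_bytes(COLUMNS_IN_ROW, AVG_COLUMN_BYTES)
amplification = cached // returned
rows_that_fit = ROW_CACHE_BYTES // cached

print(f"slice returned : {returned} bytes")
print(f"row cached     : {cached} bytes")
print(f"amplification  : {amplification}x")
print(f"wide rows that fit in the cache: {rows_that_fit}")
```

Under these assumed sizes, each 200-column read pulls a row into the cache
that is a couple of hundred times larger than the data requested, and only
a handful of such rows fit before something has to be evicted.
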
>  - the on-heap cache will result in evictions that crush the JVM 
> trying to manage garbage. This is also the case if the rows have an 
> uneven size distribution (as small rows can push out a single large 
> row, large rows push out many small ones, etc).
>
>  - the off-heap cache will spend a lot of time serializing and 
> deserializing wide rows, such that it can increase latency relative to 
> just reading from disk and leveraging the filesystem's cache directly.
>
> The cache resizing behaviour does exist to preserve the server's 
> memory, but it can also cause a death spiral in the on-heap case, 
> because a relatively smaller cache may result in data being evicted 
> more frequently.  I've seen cases where sizing up the cache can 
> stabilise a server's memory.
>
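The eviction churn described above can be sketched with a toy
size-bounded LRU cache. This is only an illustration under made-up
numbers (90 one-unit rows plus one 50-unit wide row, read in a loop); the
class name and workload are hypothetical, and it does not model
Cassandra's actual cache providers or the garbage collector.

```python
from collections import OrderedDict


class SizeBoundedLRU:
    """Toy LRU cache bounded by total size; counts misses and evictions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.entries = OrderedDict()  # key -> size, least recent first
        self.misses = 0
        self.evictions = 0

    def read(self, key, size):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return
        self.misses += 1
        if size > self.capacity:
            return  # entry can never fit, so it is not cached
        # Evict least recently used entries until the new one fits.
        while self.used + size > self.capacity:
            _, evicted_size = self.entries.popitem(last=False)
            self.used -= evicted_size
            self.evictions += 1
        self.entries[key] = size
        self.used += size


def run(capacity, rounds=10):
    cache = SizeBoundedLRU(capacity)
    for _ in range(rounds):
        for i in range(90):
            cache.read(f"small-{i}", 1)   # many small rows
        cache.read("wide", 50)            # one wide row
    return cache


small = run(capacity=100)  # working set (140 units) does not fit
big = run(capacity=150)    # working set fits

print("capacity 100:", small.misses, "misses,", small.evictions, "evictions")
print("capacity 150:", big.misses, "misses,", big.evictions, "evictions")
```

With these invented sizes the 100-unit cache never gets a hit, because the
wide row and the small rows keep pushing each other out, while the
150-unit cache misses only on first touch and evicts nothing. That matches
the observation that sizing the cache up, rather than down, can be what
stabilises things.
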
> This isn't just a Cassandra thing, it simply happens to be very 
> evident with that system - generally to get an effective benefit from 
> a cache, the data should be contiguously sized and not too large to 
> allow effective cache 'lining'.
>
> Bill
>
> On 02/12/12 21:36, Mike wrote:
>> Hello,
>>
>> We recently hit an issue within our Cassandra-based application.  We
>> have a relatively new Column Family with some very wide rows (tens of
>> thousands of columns, or more in some cases).  During a periodic
>> activity, we query the range of columns to retrieve various pieces of
>> information, a segment at a time.
>>
>> We do these same queries frequently at various stages of the process,
>> and I thought the application could see a performance benefit from row
>> caching.  We have a small row cache (100MB per node) already enabled,
>> and I enabled row caching on the new column family.
>>
>> The results were very negative.  When performing range queries with a
>> limit of 200 results, for a small minority of the rows in the new column
>> family, performance plummeted.  CPU utilization on the Cassandra node
>> went through the roof, and it started chewing up memory.  Some queries
>> to this column family hung completely.
>>
>> According to the logs, we started getting frequent GCInspector
>> messages.  Cassandra started flushing the largest memtables due to
>> hitting the "flush_largest_memtables_at" of 75%, and scaling back the
>> key/row caches.  However, to Cassandra's credit, it did not die with an
>> OutOfMemory error.  Its emergency measures to conserve memory worked,
>> and the cluster stayed up and running.  No real errors showed in the
>> logs, except for messages getting dropped, which I believe was caused
>> by what was going on with CPU and memory.
>>
>> Disabling row caching on this new column family has resolved the issue
>> for now, but, is there something fundamental about row caching that I am
>> missing?
>>
>> We are running Cassandra 1.1.2 with a 6 node cluster, with a replication
>> factor of 3.
>>
>> Thanks,
>> -Mike
>>
>>
>