cassandra-user mailing list archives

From Chris Goffinet <goffi...@digg.com>
Subject Re: Persistently increasing read latency
Date Thu, 03 Dec 2009 19:26:09 GMT
Tim,

After you stop the test, do you see the pending tasks for compaction
drop? We need to verify you didn't run into a new bug. If the number
starts to drop slowly, that just indicates that compactions are not
keeping up with your write levels.
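
A minimal sketch of that check: poll the compaction pool's pending-task
count over JMX after the load stops. The MBean name
(org.apache.cassandra.concurrent:type=COMPACTION-POOL) and the JMX port
8080 are assumptions based on the 0.4-era layout; confirm both in
jconsole before relying on them.

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class CompactionBacklogWatcher {
        public static void main(String[] args) throws Exception {
            // Cassandra's JMX endpoint -- the same one jconsole attaches to.
            // Host and port (8080 was the 0.4-era default) are assumptions; adjust to your node.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Thread-pool MBean name assumed from the 0.4-era naming; verify it in jconsole.
            ObjectName pool = new ObjectName(
                    "org.apache.cassandra.concurrent:type=COMPACTION-POOL");
            // Sample once a minute: after the test stops, the backlog should drain steadily.
            while (true) {
                Object pending = mbs.getAttribute(pool, "PendingTasks");
                System.out.println(System.currentTimeMillis() + " pending compactions: " + pending);
                Thread.sleep(60000);
            }
        }
    }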

On Dec 3, 2009, at 11:11 AM, Freeman, Tim wrote:

>> how many are in your data directories?  is your compaction
>> lagging 1000s of tables behind again?
>
> Yes, there are 2348 files in data/Keyspace1, and jconsole says the  
> compaction pool has >1600 pending tasks.
>
> Chris Goffinet's questions were good too but it will take me a  
> little while to get answers.
>
> Tim Freeman
> Email: tim.freeman@hp.com
> Desk in Palo Alto: (650) 857-2581
> Home: (408) 774-1298
> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday,  
> and Thursday; call my desk instead.)
>
>
> -----Original Message-----
> From: Jonathan Ellis [mailto:jbellis@gmail.com]
> Sent: Thursday, December 03, 2009 11:02 AM
> To: cassandra-user@incubator.apache.org
> Subject: Re: Persistently increasing read latency
>
> i would expect read latency to increase linearly w/ the number of
> sstables you have around.  how many are in your data directories?  is
> your compaction lagging 1000s of tables behind again?
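
To put a number on that, something like the sketch below counts the
-Data.db components (one per live sstable) under a data directory. The
default path is only a guess at the stock DataFileDirectory; pass the
directory you actually configured.

    import java.io.File;
    import java.io.FilenameFilter;

    public class SSTableCounter {
        public static void main(String[] args) {
            // Pass your DataFileDirectory/Keyspace1 path; the default here is just a placeholder.
            File dir = new File(args.length > 0 ? args[0] : "/var/lib/cassandra/data/Keyspace1");
            String[] dataFiles = dir.list(new FilenameFilter() {
                public boolean accept(File d, String name) {
                    // Each live sstable has exactly one -Data.db component; the Index and
                    // Filter files share its name, so counting -Data.db avoids triple-counting.
                    return name.endsWith("-Data.db");
                }
            });
            System.out.println(dir + ": " + (dataFiles == null ? 0 : dataFiles.length) + " sstables");
        }
    }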
>
> On Thu, Dec 3, 2009 at 12:58 PM, Freeman, Tim <tim.freeman@hp.com>  
> wrote:
>> I ran another test last night with the build dated 29 Nov 2009.   
>> Other than the Cassandra version, the setup was the same as  
>> before.  I got qualitatively similar results as before, too -- the  
>> read latency increased fairly smoothly from 250ms to 1s, the GC  
>> times reported by jconsole are low, the pending tasks for
>> row-mutation-stage and row-read-stage are less than 10, the pending
>> tasks for the compaction pool are 1615.  Last time around the read  
>> latency maxed out at one second.  This time, it just got to one  
>> second as I'm writing this so I don't know yet if it will continue  
>> to increase.
>>
>> I have attached a fresh graph describing the present run.  It's  
>> qualitatively similar to the previous one.  The vertical units are  
>> milliseconds (for latency) and operations per minute (for reads or  
>> writes).  The horizontal scale is seconds.  The feature that's  
>> bothering me is the red line for the read latency going diagonally  
>> from lower left to the lower-middle right.  The scale doesn't make  
>> it look dramatic, but Cassandra slowed down by a factor of 4.
>>
>> The read and write rates were stable for 45,000 seconds or so, and  
>> then the read latency got big enough that the application was  
>> starved for reads and it started writing less.
>>
>> If this is worth pursuing, I suppose the next step would be for me  
>> to make a small program that reproduces the problem.  It should be  
>> easy -- we're just reading and writing random records.  Let me know  
>> if there's interest in that.  I could  also decide to live with a  
>> 1000 ms latency here.  I'm thinking of putting a cache in the local  
>> filesystem in front of Cassandra (or whichever distributed DB we  
>> decide to go with), so living with it is definitely possible.
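
For what it's worth, a skeleton of that small program might look like
the sketch below, using the numbers from this thread (350K records,
~100 KB values, 8 reader and 8 writer threads). The Store interface and
its no-op stand-in are placeholders so the sketch doesn't guess at the
0.4 Thrift client signatures; wire the real client calls in there.

    import java.util.Random;
    import java.util.concurrent.atomic.AtomicLong;

    public class RandomReadWriteLoad {
        // Placeholder for the Cassandra client; implement put/get with your Thrift calls.
        interface Store {
            void put(String key, byte[] value) throws Exception;
            byte[] get(String key) throws Exception;
        }

        static final int RECORDS = 350000;        // record count from the test setup
        static final int VALUE_SIZE = 100 * 1024; // ~100 KB per record
        static final AtomicLong readCount = new AtomicLong();
        static final AtomicLong readMillis = new AtomicLong();

        public static void main(String[] args) throws Exception {
            // No-op stand-in so the skeleton compiles and runs; replace with a real Store.
            final Store store = new Store() {
                public void put(String key, byte[] value) { }
                public byte[] get(String key) { return new byte[0]; }
            };
            for (int i = 0; i < 8; i++) {
                new Thread(new Runnable() {
                    public void run() {
                        Random rnd = new Random();
                        byte[] value = new byte[VALUE_SIZE];
                        while (true) {
                            try {
                                rnd.nextBytes(value);
                                store.put("key-" + rnd.nextInt(RECORDS), value);
                            } catch (Exception e) { e.printStackTrace(); }
                        }
                    }
                }, "writer-" + i).start();
                new Thread(new Runnable() {
                    public void run() {
                        Random rnd = new Random();
                        while (true) {
                            try {
                                long t0 = System.currentTimeMillis();
                                store.get("key-" + rnd.nextInt(RECORDS));
                                readMillis.addAndGet(System.currentTimeMillis() - t0);
                                readCount.incrementAndGet();
                            } catch (Exception e) { e.printStackTrace(); }
                        }
                    }
                }, "reader-" + i).start();
            }
            // Print reads/min and average read latency once a minute so drift shows up over hours.
            while (true) {
                Thread.sleep(60000);
                long n = readCount.getAndSet(0);
                long ms = readMillis.getAndSet(0);
                System.out.println("reads/min=" + n + " avg-read-ms=" + (n == 0 ? 0 : ms / n));
            }
        }
    }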
>>
>> Tim Freeman
>> Email: tim.freeman@hp.com
>> Desk in Palo Alto: (650) 857-2581
>> Home: (408) 774-1298
>> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday,  
>> and Thursday; call my desk instead.)
>>
>> -----Original Message-----
>> From: Jonathan Ellis [mailto:jbellis@gmail.com]
>> Sent: Tuesday, December 01, 2009 11:10 AM
>> To: cassandra-user@incubator.apache.org
>> Subject: Re: Persistently increasing read latency
>>
>> 1) use jconsole to see what is happening to jvm / cassandra  
>> internals.
>>  possibly you are slowly exceeding cassandra's ability to keep up  
>> with
>> writes, causing the jvm to spend more and more effort GCing to find
>> enough memory to keep going
>>
>> 2) you should be at least on 0.4.2 and preferably trunk if you are
>> stress testing
>>
>> -Jonathan
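
If it helps to have point 1) as a log instead of a live jconsole
window, a small sketch like the one below samples the standard
java.lang GarbageCollector MBeans over the same JMX endpoint. The host
and port are assumptions; the attribute names (CollectionCount,
CollectionTime) are part of the standard platform MBeans.

    import java.util.Set;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class GcWatcher {
        public static void main(String[] args) throws Exception {
            // Same JMX endpoint jconsole uses; host and port are assumptions, adjust as needed.
            JMXConnector c = JMXConnectorFactory.connect(new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi"));
            MBeanServerConnection mbs = c.getMBeanServerConnection();
            Set<ObjectName> collectors =
                    mbs.queryNames(new ObjectName("java.lang:type=GarbageCollector,*"), null);
            while (true) {
                // CollectionTime is cumulative milliseconds spent in GC since JVM start, so a
                // growing delta between samples means the JVM is spending more of its time in GC.
                for (ObjectName gc : collectors) {
                    System.out.println(gc.getKeyProperty("name")
                            + " count=" + mbs.getAttribute(gc, "CollectionCount")
                            + " timeMs=" + mbs.getAttribute(gc, "CollectionTime"));
                }
                Thread.sleep(60000);
            }
        }
    }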
>>
>> On Tue, Dec 1, 2009 at 12:11 PM, Freeman, Tim <tim.freeman@hp.com>  
>> wrote:
>>> In an 8 hour test run, I've seen the read latency for Cassandra  
>>> drift fairly linearly from ~460ms to ~900ms.  Eventually my  
>>> application gets starved for reads and starts misbehaving.  I have  
>>> attached graphs -- horizontal scales are seconds, vertical scales  
>>> are operations per minute and average milliseconds per operation.   
>>> The clearest feature is the light blue line in the left graph  
>>> drifting consistently upward during the run.
>>>
>>> I have a Cassandra 0.4.1 database, one node, records are 100kbytes  
>>> each, 350K records, 8 threads reading, around 700 reads per  
>>> minute.  There are also 8 threads writing.  This is all happening  
>>> on a 4 core processor that's supporting both the Cassandra node  
>>> and the code that's generating load for it.  I'm reasonably sure  
>>> that there are no page faults.
>>>
>>> I have attached my storage-conf.xml.  Briefly, it has default  
>>> values, except RpcTimeoutInMillis is 30000 and the partitioner is  
>>> OrderPreservingPartitioner.  Cassandra's garbage collection  
>>> parameters are:
>>>
>>>   -Xms128m -Xmx1G -XX:SurvivorRatio=8 -XX:+AggressiveOpts
>>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>>
>>> Is this normal behavior?  Is there some change to the  
>>> configuration I should make to get it to stop getting slower?  If  
>>> it's not normal, what debugging information should I gather?   
>>> Should I give up on Cassandra 0.4.1 and move to a newer version?
>>>
>>> I'll leave it running for the time being in case there's something  
>>> useful to extract from it.
>>>
>>> Tim Freeman
>>> Email: tim.freeman@hp.com
>>> Desk in Palo Alto: (650) 857-2581
>>> Home: (408) 774-1298
>>> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday,  
>>> and Thursday; call my desk instead.)
>>>
>>>
>>

