cassandra-user mailing list archives

From "Freeman, Tim" <tim.free...@hp.com>
Subject RE: Persistently increasing read latency
Date Thu, 03 Dec 2009 19:11:30 GMT
>how many are in your data directories?  is your compaction 
>lagging 1000s of tables behind again?

Yes, there are 2348 files in data/Keyspace1, and jconsole says the compaction pool has >1600
pending tasks.
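A quick way to confirm that kind of backlog from a script rather than jconsole is to count the files in the keyspace's data directory. This is an illustrative sketch only: it runs against a temporary directory with made-up SSTable-style file names standing in for data/Keyspace1.

```python
import os
import tempfile

def count_data_files(data_dir):
    """Count the files in a keyspace data directory.  A count in the
    thousands (like the 2348 above) suggests compaction is not
    keeping up with the write load."""
    return len(os.listdir(data_dir))

# Demo against a temp dir with fabricated SSTable-style names:
demo_dir = tempfile.mkdtemp()
for i in range(5):
    open(os.path.join(demo_dir, "Standard1-%d-Data.db" % i), "w").close()
print(count_data_files(demo_dir))  # 5
```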

Chris Goffinet's questions were good too, but it will take me a little while to get answers.

Tim Freeman
Email: tim.freeman@hp.com
Desk in Palo Alto: (650) 857-2581
Home: (408) 774-1298
Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call my desk
instead.)


-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com] 
Sent: Thursday, December 03, 2009 11:02 AM
To: cassandra-user@incubator.apache.org
Subject: Re: Persistently increasing read latency

i would expect read latency to increase linearly w/ the number of
sstables you have around.  how many are in your data directories?  is
your compaction lagging 1000s of tables behind again?
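Jonathan's expectation here (read latency roughly linear in the number of live SSTables, since a read may have to consult each one) can be sketched as a toy model. All the constants below are invented for illustration; they are not measurements from this cluster.

```python
def modeled_read_latency_ms(sstable_count, base_ms=10.0, per_sstable_ms=0.4):
    """Toy linear model: each extra SSTable adds a roughly constant
    amount of per-read work (filter check, index lookup, seek), so
    total read latency grows linearly with the SSTable count."""
    return base_ms + per_sstable_ms * sstable_count

# Doubling the SSTable count roughly doubles the added read cost:
print(modeled_read_latency_ms(100))  # 50.0
print(modeled_read_latency_ms(200))  # 90.0
```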

On Thu, Dec 3, 2009 at 12:58 PM, Freeman, Tim <tim.freeman@hp.com> wrote:
> I ran another test last night with the build dated 29 Nov 2009.  Other than the Cassandra
> version, the setup was the same as before.  I got qualitatively similar results to before,
> too -- the read latency increased fairly smoothly from 250ms to 1s, the GC times reported
> by jconsole are low, the pending tasks for row-mutation-stage and row-read-stage are less
> than 10, and the pending tasks for the compaction pool are 1615.  Last time around the read
> latency maxed out at one second.  This time, it just got to one second as I'm writing this,
> so I don't know yet if it will continue to increase.
>
> I have attached a fresh graph describing the present run.  It's qualitatively similar
> to the previous one.  The vertical units are milliseconds (for latency) and operations per
> minute (for reads or writes).  The horizontal scale is seconds.  The feature that's bothering
> me is the red line for the read latency climbing diagonally from the lower left to the
> middle right.  The scale doesn't make it look dramatic, but Cassandra slowed down by a
> factor of 4.
>
> The read and write rates were stable for 45,000 seconds or so, and then the read latency
> got big enough that the application was starved for reads and it started writing less.
>
> If this is worth pursuing, I suppose the next step would be for me to make a small program
> that reproduces the problem.  It should be easy -- we're just reading and writing random
> records.  Let me know if there's interest in that.  I could also decide to live with a
> 1000 ms latency here.  I'm thinking of putting a cache in the local filesystem in front of
> Cassandra (or whichever distributed DB we decide to go with), so living with it is
> definitely possible.
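A sketch of the kind of small reproduction program described above: random reads and writes of fixed-size records. The `store` here is a plain dict standing in for a real Cassandra client (which in 0.4 would go through Thrift), and the key and record sizes echo the figures from the original test (350K records of ~100 KB).

```python
import random
import time

def run_random_workload(store, num_keys=350000, record_bytes=100000, ops=200):
    """Issue a 50/50 mix of random reads and writes of fixed-size
    records and return the average per-operation latency in seconds."""
    payload = "x" * record_bytes
    total = 0.0
    for _ in range(ops):
        key = str(random.randrange(num_keys))
        start = time.monotonic()
        if random.random() < 0.5:
            store[key] = payload      # write path
        else:
            store.get(key)            # read path; misses are fine
        total += time.monotonic() - start
    return total / ops

# Demo against an in-memory dict; with a real client you would watch
# this average creep upward as data files accumulate:
avg_s = run_random_workload({}, ops=100)
print("avg latency: %.6f s" % avg_s)
```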
>
> Tim Freeman
> Email: tim.freeman@hp.com
> Desk in Palo Alto: (650) 857-2581
> Home: (408) 774-1298
> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call
> my desk instead.)
>
> -----Original Message-----
> From: Jonathan Ellis [mailto:jbellis@gmail.com]
> Sent: Tuesday, December 01, 2009 11:10 AM
> To: cassandra-user@incubator.apache.org
> Subject: Re: Persistently increasing read latency
>
> 1) use jconsole to see what is happening to jvm / cassandra internals.
>  possibly you are slowly exceeding cassandra's ability to keep up with
> writes, causing the jvm to spend more and more effort GCing to find
> enough memory to keep going
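Besides jconsole, the JDK's `jstat -gcutil <pid> 1000` prints the same GC counters one line per second, which is easier to log over a long run. Below is a hedged parsing sketch; the sample line is fabricated, not output from this cluster, and it assumes the JDK 6-era column order (S0 S1 E O P YGC YGCT FGC FGCT GCT).

```python
def total_gc_seconds(gcutil_line):
    """Pull the GCT column (cumulative GC time, in seconds) from one
    data line of `jstat -gcutil`.  A steadily climbing GCT alongside
    falling throughput means the JVM is spending its time collecting
    rather than serving reads."""
    return float(gcutil_line.split()[-1])

# Fabricated sample line for illustration:
sample = "  0.00  45.12  63.40  71.25  59.80   214    3.412     9    2.801    6.213"
print(total_gc_seconds(sample))  # 6.213
```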
>
> 2) you should be at least on 0.4.2 and preferably trunk if you are
> stress testing
>
> -Jonathan
>
> On Tue, Dec 1, 2009 at 12:11 PM, Freeman, Tim <tim.freeman@hp.com> wrote:
>> In an 8 hour test run, I've seen the read latency for Cassandra drift fairly linearly
>> from ~460ms to ~900ms.  Eventually my application gets starved for reads and starts
>> misbehaving.  I have attached graphs -- horizontal scales are seconds, vertical scales
>> are operations per minute and average milliseconds per operation.  The clearest feature
>> is the light blue line in the left graph drifting consistently upward during the run.
>>
>> I have a Cassandra 0.4.1 database: one node, records are 100 kbytes each, 350K records,
>> 8 threads reading, around 700 reads per minute.  There are also 8 threads writing.  This
>> is all happening on a 4-core processor that's supporting both the Cassandra node and the
>> code that's generating load for it.  I'm reasonably sure that there are no page faults.
>>
>> I have attached my storage-conf.xml.  Briefly, it has default values, except that
>> RpcTimeoutInMillis is 30000 and the partitioner is OrderPreservingPartitioner.
>> Cassandra's garbage collection parameters are:
>>
>>   -Xms128m -Xmx1G -XX:SurvivorRatio=8 -XX:+AggressiveOpts -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>
>> Is this normal behavior?  Is there some change to the configuration I should make
>> to get it to stop getting slower?  If it's not normal, what debugging information should
>> I gather?  Should I give up on Cassandra 0.4.1 and move to a newer version?
>>
>> I'll leave it running for the time being in case there's something useful to extract
>> from it.
>>
>> Tim Freeman
>> Email: tim.freeman@hp.com
>> Desk in Palo Alto: (650) 857-2581
>> Home: (408) 774-1298
>> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday;
>> call my desk instead.)
>>
>>
>
